Abstract

In the Correlation Clustering problem, also known as Cluster Editing, we are given an undirected $n$-vertex graph $G$ and a positive integer $k$. The task is to decide if $G$ can be transformed into a cluster graph, i.e., a disjoint union of cliques, by changing at most $k$ adjacencies, i.e., by adding or deleting at most $k$ edges. We give a subexponential algorithm that

• in time $2^{O(\sqrt{pk})} + n^{O(1)}$ decides whether $G$ can be transformed into a cluster graph with $p$ cliques by changing at most $k$ adjacencies.

We complement our algorithmic findings by the following tight lower bounds on the asymptotic behaviour of our algorithm. We show that unless ETH fails

• for any constant $0 \le \sigma \le 1$, there is $p = \Theta(k^{\sigma})$ such that there is no algorithm deciding in time $2^{o(\sqrt{pk})} \cdot n^{O(1)}$ whether $G$ can be transformed into a cluster graph with $p$ cliques by changing at most $k$ adjacencies.



1 Introduction 

Correlation clustering, also known as clustering with qualitative information, or cluster editing, is the problem of clustering objects based only on qualitative information concerning similarity between pairs of items. For every pair of objects, we have an indication of whether the objects are similar or not. The task is to find a partition of the objects into clusters minimizing the number of similarities between different clusters and non-similarities inside clusters. The problem was introduced by Ben-Dor, Shamir, and Yakhini [7], motivated by some problems from computational biology, and, independently, by Bansal, Blum, and Chawla [6], motivated by machine learning problems
concerning document clustering according to similarities. The correlation version of clustering has been studied intensively; see, e.g., [33].

The graph-theoretic formulation of the problem is the following. A graph is a cluster graph if every connected component of it is a complete graph. Let $G = (V, E)$ be a graph and let $F \subseteq V \times V$ be such that $G \triangle F = (V, E \triangle F)$ is a cluster graph; then $F$ is called a cluster editing set for $G$. Here $E \triangle F$ is the symmetric difference between $E$ and $F$. In the optimization version of the problem the task is to find a cluster editing set of minimum size. Constant factor approximation
algorithms for this problem were obtained in [I1E1I13]. On the negative side, the problem is known 
to be NP-complete [33]. Moreover, it was shown by Charikar, Guruswami, and Wirth [14] that the
problem is APX-hard.
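
The definitions above can be illustrated with a small sketch. It is not part of the original paper: the function names `components`, `is_cluster_graph`, and `apply_editing_set` are hypothetical, and the graph is assumed to be simple and given as a vertex list plus a collection of two-element edges.

```python
from itertools import combinations

def components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps

def norm(edges):
    """Represent undirected edges as a set of frozensets."""
    return {frozenset(e) for e in edges}

def is_cluster_graph(vertices, edges):
    """A graph is a cluster graph iff every connected component is a clique."""
    es = norm(edges)
    return all(frozenset((u, v)) in es
               for comp in components(vertices, es)
               for u, v in combinations(comp, 2))

def apply_editing_set(edges, F):
    """Return the edge set of G triangle F (symmetric difference of E and F)."""
    return norm(edges) ^ norm(F)
```

For the path a-b-c, either adding the edge ac or deleting the edge bc is a cluster editing set of size one.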

Giotis and Guruswami [24] initiated the study of clustering when the maximum number of clusters that we are allowed to use is stipulated to be a fixed constant $p$. As observed by them, this type of clustering is well-motivated in settings where the number of clusters might be an external constraint that has to be met. It turned out that $p$-clustering variants posed new and non-trivial challenges. In particular, in spite of the APX-hardness of the general case, Giotis and Guruswami [24] gave a PTAS for this version of the problem.

In parameterized complexity the problem was studied under the name of Cluster Editing. 

Cluster Editing 

Input: A graph $G = (V, E)$ and a non-negative integer $k$.
Parameter: k. 

Question: Is there a cluster editing set for G of size at most k? 



The parameterized version of Cluster Editing, and variants of it, were studied intensively [8, 9, 10, 11, 13, 17, 20, 25, 27, 28, 31, 32]. The problem is solvable in time $O(1.62^k + n + m)$ and it has a kernel with $2k$ vertices [13, 16] (see Section 2 for the definition of a kernel).

A cluster graph $G$ is called a $p$-cluster graph if it has $p$ connected components or, equivalently, if it is a vertex-disjoint union of $p$ cliques. Similarly, a set $F$ is a $p$-cluster editing set of $G$ if $G \triangle F = (V, E \triangle F)$ is a $p$-cluster graph. We also study the following variation of the clustering problem.

p-Cluster Editing

Input: A graph $G = (V, E)$ and a non-negative integer $k$.
Parameter: $k$.

Question: Is there a $p$-cluster editing set for $G$ of size at most $k$?

Shamir et al. [33] showed that p-Cluster Editing is NP-complete for every fixed $p \ge 2$. A kernel with $(p + 2)k + p$ vertices was given by Guo [26].

Subexponential complexity. In this paper we establish several results around the (im)possibility of solving Cluster Editing in subexponential time. Flum and Grohe [21, Chapter 16] defined the complexity class SUBEPT, which, loosely speaking (we skip here some technical conditions), is the class of problems solvable in time $2^{o(k)} n^{O(1)}$, where $n$ is the input length and $k$ is the parameter. This is a very interesting class because the problems from SUBEPT are "easier" than "usual" parameterized problems. To make this statement more concrete, we need a well-known complexity hypothesis formulated by Impagliazzo, Paturi, and Zane [29].

Exponential Time Hypothesis (ETH): There is a positive real $s$ such that 3-CNF-SAT with $n$ variables and $m$ clauses cannot be solved in time $2^{sn}(n + m)^{O(1)}$.

This hypothesis is widely applied in the theory of exact exponential algorithms for hard problems, which are better than the trivial exhaustive search, though still exponential [22]. Flum and Grohe have shown that most of the natural parameterized problems are not in SUBEPT unless ETH fails [21]. Thus it is most likely that the majority of parameterized problems are not solvable in subexponential time. Until very recently, the only problems known to be in the class SUBEPT were the problems with additional constraints on the input, like being a planar, $H$-minor-free, or tournament graph [3, 18]. However, recent algorithmic developments indicate that the structure of the class SUBEPT is much more interesting than expected. It appears that a class of parameterized problems related to chordal graphs, like Minimum Fill-in or Chordal Graph Sandwich, are in SUBEPT [23].

There is a striking resemblance between Cluster Editing and Feedback Arc Set on Tournaments (FAST), and a number of generic algorithmic approaches can be applied to both problems. By a result of Alon, Lokshtanov, and Saurabh [3], FAST is in SUBEPT. Based on this, Cao and Chen [13] conjectured that Cluster Editing is also solvable in subexponential time. While we resolve this conjecture negatively, a refinement of the problem, the case when the number of clusters is bounded, belongs to SUBEPT.

Our results. Our main result is the following theorem establishing the membership of p-Cluster Editing in SUBEPT.

Theorem 1. p-Cluster Editing is solvable in time $O(2^{O(\sqrt{pk})} + m + n)$ on instances with $n$ vertices and $m$ edges.

Let us remark that by Theorem 1, p-Cluster Editing is in the class SUBEPT for $p = o(k)$. The ideas used to prove Theorem 1 are quite different from what was used for FAST [3] or Minimum Fill-in [23]. The crucial observation is that a $p$-cluster graph has $2^{O(\sqrt{pk})}$ cuts of size at most $k$ (henceforth called $k$-cuts). As in a YES-instance of the p-Cluster Editing problem each $k$-cut is a $2k$-cut of a $p$-cluster graph, we infer a similar bound on the number of cuts if the input instance is a YES-instance. This allows us to use dynamic programming over the set of $k$-cuts of the input graph. Together with a kernelization algorithm for p-Cluster Editing, this yields Theorem 1.

We complement Theorem 1 with two lower bounds. Our first lower bound is based on Theorem 2. Its subsequent Corollaries 1 and 2 show that the exponential time dependence of our algorithm is asymptotically tight for any reasonable choice of parameters $p$ and $k$.

Theorem 2. For any $\varepsilon > 0$ there is $\delta > 0$ and a polynomial-time algorithm that, given positive integers $p$ and $k$ and a 3-CNF-SAT formula $\Phi$ with $n$ variables and $m$ clauses, such that $k, n \ge \varepsilon p$ and $n, m \le \sqrt{pk}/\varepsilon$, computes a graph $G$ and integer $k'$, such that $k' \le \delta k$, $|V(G)| \le \delta\sqrt{pk}$ and

• if $\Phi$ is satisfiable, then there exists a $6p$-cluster graph $G_0$ with $V(G) = V(G_0)$ and $|E(G) \triangle E(G_0)| \le k'$; and

• if there exists a $p'$-cluster graph $G_0$ with $p' \le 6p$, $V(G) = V(G_0)$ and $|E(G) \triangle E(G_0)| \le k'$, then $\Phi$ is satisfiable.

The statement of Theorem 2 may look very technical, so let us now briefly elaborate on its consequences. Recall that the existence of an algorithm for verifying satisfiability of 3-CNF-SAT formulas that is subexponential in both the number of variables and the number of clauses would violate ETH [29].

Corollary 1. Unless ETH fails, for every $0 \le \sigma \le 1$, there is $p = \Theta(k^{\sigma})$ such that p-Cluster Editing is not solvable in time $2^{o(\sqrt{pk})} |V(G)|^{O(1)}$.
Proof. Assume we are given a 3-CNF-SAT formula $\Phi$ with $n$ variables and $m$ clauses. If $n < m$, $\lceil (m - n)/2 \rceil$ times perform the following operation: add three new variables $x$, $y$ and $z$, and the clause $(x \vee y \vee z)$. In this way we preserve the satisfiability of $\Phi$, increase the size of $\Phi$ by a constant factor, and ensure that $n \ge m$.

Take now $k = \lceil n^{2/(1+\sigma)} \rceil$, $p = \lceil n^{2\sigma/(1+\sigma)} \rceil$. As $n \ge m$ and $0 \le \sigma \le 1$, we have $k, n \ge p$ and $n, m \le \sqrt{pk}$ but $n, m = \Omega(\sqrt{pk})$. Invoke Theorem 2 for $\varepsilon = 1$ and feed the reduction algorithm with the formula $\Phi$ and parameters $p$ and $k$, obtaining a graph $G$ and a parameter $k'$. Note that $6p = \Theta(k^{\sigma})$. Apply the assumed algorithm for the p-Cluster Editing problem to the instance $(G, 6p, k')$. In this way we resolve the satisfiability of $\Phi$ in time $2^{o(\sqrt{pk})} |V(G)|^{O(1)} = 2^{o(n+m)}$, contradicting ETH. □
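
The parameter choice in this proof can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `corollary1_parameters` is hypothetical, and the exponents are evaluated in floating point, so the ceilings may be off by one for some inputs (which only makes $p$ and $k$ slightly larger).

```python
from math import ceil

def corollary1_parameters(n, sigma):
    """Return (p, k) with k = ceil(n^(2/(1+sigma))) and p = ceil(n^(2*sigma/(1+sigma)))."""
    k = ceil(n ** (2 / (1 + sigma)))
    p = ceil(n ** (2 * sigma / (1 + sigma)))
    return p, k

# The extreme cases: sigma = 0 gives a constant number of clusters,
# sigma = 1 gives p and k of the same order.
print(corollary1_parameters(100, 0))
print(corollary1_parameters(100, 1))
```

Since both ceilings are at least the exact powers, whose product is $n^2$, the inequality $n \le \sqrt{pk}$ required by Theorem 2 with $\varepsilon = 1$ holds.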

Corollary 2. Unless ETH fails, there does not exist a constant $p \ge 6$ and an algorithm that solves p-Cluster Editing in time $2^{o(\sqrt{k})} |V(G)|^{O(1)}$ or $2^{o(|V(G)|)}$ for a fixed number of $p$ clusters.

Proof. We prove the corollary for $p = 6$; the claim for larger values of $p$ can be easily obtained by adding $p - 6$ large cliques to the graph obtained in the reduction.

Assume we are given a 3-CNF-SAT formula $\Phi$ with $n$ variables and $m$ clauses. Take $k = \max(n, m)^2$, invoke Theorem 2 for $\varepsilon = 1$ and feed the reduction algorithm with the formula $\Phi$ and parameters 1 and $k$, obtaining a graph $G$ and a parameter $k'$. Note that $|V(G)| = O(\sqrt{k})$. Apply the assumed algorithm for the p-Cluster Editing problem to the instance $(G, 6, k')$. In this way we resolve the satisfiability of $\Phi$ in time $2^{o(\sqrt{k})} |V(G)|^{O(1)} = 2^{o(n+m)}$, contradicting ETH. □

Let us remark that Theorem 2 and Corollary 1 do not rule out the possibility that Cluster Editing is solvable in subexponential time. Our second lower bound shows that when the number of clusters is not constrained, then the problem cannot be solved in subexponential time unless ETH fails. This disproves the conjecture of Cao and Chen [13]. We note that Theorem 3 was independently obtained by Komusiewicz in his PhD thesis [30].

Theorem 3. Unless ETH fails, Cluster Editing cannot be solved in time $2^{o(k)} n^{O(1)}$.

Let us remark that the proof of Theorem 3 holds for the case when the number of clusters $p$ is $\Theta(k)$. Theorems 1 and 3 show that a phase transition of the problem complexity occurs when the number of clusters switches from a sublinear to a linear function of $k$. It is also worth noting that Theorems 1 and 3 establish interesting parallels between parameterized complexity and approximability of clustering problems. As we already mentioned, bounding the number of clusters drops the complexity of the problem drastically, from APX-hardness to a PTAS [14, 24]. By Theorems 1 and 3, exactly the same phenomenon occurs for subexponential-time solvability of clustering.

2 Preliminaries 

We denote by $G = (V, E)$ a finite, undirected, and simple graph with vertex set $V(G) = V$ and edge set $E(G) = E$. We also use $n$ to denote the number of vertices and $m$ the number of edges in $G$. For a nonempty subset $W \subseteq V$, the subgraph of $G$ induced by $W$ is denoted by $G[W]$. We say that a vertex set $W \subseteq V$ is connected if $G[W]$ is connected. The open neighborhood of a vertex $v$ is $N(v) = \{u : uv \in E\}$ and the closed neighborhood is $N[v] = N(v) \cup \{v\}$. For a vertex set $W \subseteq V$ we put $N(W) = \bigcup_{v \in W} N(v) \setminus W$ and $N[W] = N(W) \cup W$. For graphs $G$, $H$, by $\mathcal{H}(G, H)$ we denote the number of edge modifications needed to obtain $H$ from $G$, i.e., the number of edges present in $G$ and not present in $H$ plus vice versa.
Parameterized complexity. A parameterized problem $\Pi$ is a subset of $\Gamma^* \times \mathbb{N}$ for some finite alphabet $\Gamma$. An instance of a parameterized problem consists of $(x, k)$, where $k$ is called the parameter. A central notion in parameterized complexity is fixed-parameter tractability (FPT), which means, for a given instance $(x, k)$, solvability in time $f(k) \cdot p(|x|)$, where $f$ is an arbitrary computable function of $k$ and $p$ is a polynomial in the input size. We refer to the book of Downey and Fellows [19] for further reading on parameterized complexity.

Kernelization. A kernelization algorithm for a parameterized problem $\Pi \subseteq \Gamma^* \times \mathbb{N}$ is an algorithm that, given $(x, k) \in \Gamma^* \times \mathbb{N}$, outputs in time polynomial in $|x| + k$ a pair $(x', k') \in \Gamma^* \times \mathbb{N}$, called a kernel, such that $(x, k) \in \Pi$ if and only if $(x', k') \in \Pi$, $|x'| \le g(k)$, and $k' \le k$, where $g$ is some computable function.

We need the following result of Guo [26].

Proposition 4 ([26]). p-Cluster Editing admits a kernel with $(p+2)k+p$ vertices. The running time of the kernelization algorithm is $O(n+m)$, where $n$ is the number of vertices and $m$ the number of edges in the input graph $G$.

The following lemma is used in both our lower bounds. Its proof is almost identical to the proof of Lemma 1 in [26], and we provide it here for the reader's convenience.

Lemma 5. Let $G = (V, E)$ be an undirected graph and $X \subseteq V$ be a set of vertices such that $G[X]$ is a clique and each vertex in $X$ has the same set of neighbors outside $X$ (i.e., $N_G[v] = N_G[X]$ for each $v \in X$). Let $F \subseteq V \times V$ be a set such that $G \triangle F$ is a cluster graph where the vertices of $X$ are in at least two different clusters. Then there exists $F' \subseteq V \times V$ such that: (i) $|F'| < |F|$, (ii) $G \triangle F'$ is a cluster graph with no larger number of clusters than $G \triangle F$, (iii) in $G \triangle F'$ the clique $G[X]$ is contained in one cluster.

Proof. For a vertex $v \in X$, let $F(v) = \{u \notin X : vu \in F\}$. Note that, since $N_G[v] = N_G[X]$ for all $v \in X$, we have $F(v) = F(w)$ if $v$ and $w$ belong to the same cluster in $G \triangle F$.

Let $Z$ be the vertex set of a cluster in $G \triangle F$ such that there exists $v \in Z \cap X$ with smallest $|F(v)|$. Construct $F'$ as follows: take $F$, and for each $w \in X$ replace all elements of $F$ incident with $w$ with $\{uw : u \in F(v)\}$. In other words, we modify $F$ by moving all vertices of $X \setminus Z$ to the cluster $Z$. Clearly, $G \triangle F'$ is a cluster graph, $X$ is contained in one cluster in $G \triangle F'$ and $G \triangle F'$ contains no more clusters than $G \triangle F$. To finish the proof, we need to show that $|F'| < |F|$. The sets $F$ and $F'$ contain the same set of elements not incident with $X$. As $|F(v)|$ was minimum possible, for each $w \in X$ we have $|F(w)| \ge |F'(w)|$. As $X$ was split between at least two connected components of $G \triangle F$, $F$ contains at least one edge of $G[X]$, whereas $F'$ does not. We infer that $|F'| < |F|$ and the lemma is proven. □

3 A subexponential algorithm for p-Cluster Editing

In this section we prove Theorem 1, that is, we show an $O(2^{O(\sqrt{pk})} + n + m)$-time algorithm for p-Cluster Editing.

3.1 Reduction for large p

The initial step of our algorithm consists of simple preprocessing rules reducing the instance to an equivalent instance with $p \le 6k$.
Note that at most $2k$ vertices can be incident to the elements of a $p$-cluster editing set $F$ of size at most $k$. Thus, when $p > 2k$, the input instance needs to contain at least $p - 2k$ connected components that are cliques and are left untouched in $G \triangle F$. However, even if $G$ contains a lot of connected components that are cliques, we may want to merge or split some of these cliques to obtain exactly $p$ clusters.

Consider the case that $p > 6k$. If $G$ contains fewer than $p - 2k$ connected components that are cliques, then $(G, p, k)$ is a NO-instance. If $G$ contains more than $2k$ isolated vertices, at least one of these vertices is not incident to an element of $F$, thus we may delete one isolated vertex and decrease $p$ by one.

Rule 1 If $G$ contains $2k + 1$ isolated vertices, pick one of them, say $v$, and delete it from $G$. The new instance is $(G \setminus v, p - 1, k)$.

We are left with the case where $G$ contains more than $2k$ connected components that are cliques but not isolated vertices. At least one of these cliques does not contain a vertex that is incident to an element of $F$. As discussed above, the cliques may be merged with other vertices (to decrease the number of connected components), or split into more clusters (to increase the number of connected components). In both cases, we can greedily merge or split the smallest possible cluster. Thus, without loss of generality, we can assume that the largest connected component of $G$ that is a clique is left untouched in $G \triangle F$. We reduce the input instance $(G, p, k)$ by deleting this cluster and decreasing $p$ by one.

Rule 2 If $G$ contains $2k + 1$ isolated cliques that are not isolated vertices, pick a clique $C$ of largest size and delete it from $G$. The new instance is $(G \setminus C, p - 1, k)$.

By the arguments above, every YES-instance $(G, p, k)$ for which none of the reduction rules applies has $p \le 6k$. For the rest of this section we assume that $p \le 6k$.
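
The preprocessing above can be sketched as follows. This is an illustrative reading of Rules 1 and 2, not the authors' implementation: the names `clique_components` and `reduce_large_p` are hypothetical, the rules are applied only while $p > 6k$, and no attempt is made to reach the linear running time one would want in practice.

```python
def clique_components(vertices, edges):
    """Return the connected components of the graph that induce cliques."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, cliques = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in adj[x] - comp:
                comp.add(y)
                stack.append(y)
        seen |= comp
        # A component is a clique iff every vertex is adjacent to all the others.
        if all(len(adj[v]) == len(comp) - 1 for v in comp):
            cliques.append(comp)
    return cliques

def reduce_large_p(vertices, edges, p, k):
    """Apply Rules 1 and 2 while p > 6k; return (vertices, edges, p), or None for a NO-instance."""
    vertices = set(vertices)
    edges = {frozenset(e) for e in edges}
    while p > 6 * k:
        cliques = clique_components(vertices, edges)
        if len(cliques) < p - 2 * k:
            return None  # too few untouched clique components
        singles = [c for c in cliques if len(c) == 1]
        larger = [c for c in cliques if len(c) > 1]
        if len(singles) >= 2 * k + 1:
            removed = singles[0]            # Rule 1
        elif len(larger) >= 2 * k + 1:
            removed = max(larger, key=len)  # Rule 2
        else:
            return None  # defensive: excluded by the counting argument
        vertices -= removed
        edges = {e for e in edges if not (e & removed)}
        p -= 1
    return vertices, edges, p
```

For example, on four isolated vertices plus four disjoint edges with $p = 8$ and $k = 1$, Rule 1 fires twice and the loop stops at $p = 6$.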

3.2 Binomial coefficient bounds 

Before we proceed with the description of the core part of the algorithm, we need several purely 
mathematical technical bounds. 

Lemma 6. If $a, b$ are positive integers, then $\binom{a+b}{a} \le \frac{(a+b)^{a+b}}{a^a b^b}$.

Proof. In the proof we use a folklore fact that the sequence $a_n = (1 + 1/n)^n$ is increasing. This implies that $\left(1 + \frac{1}{b}\right)^b \le \left(1 + \frac{1}{a+b}\right)^{a+b}$, equivalently $\frac{(a+b)^{a+b}}{b^b} \le \frac{(a+b+1)^{a+b}}{(b+1)^b}$.

Let us fix $a$; we prove the claim via induction with respect to $b$. For $b = 1$ the claim is equivalent to $a^a \le (a + 1)^a$ and therefore trivial. In order to check the induction step, notice that

$$\binom{a+b+1}{a} = \frac{a+b+1}{b+1} \binom{a+b}{a} \le \frac{a+b+1}{b+1} \cdot \frac{(a+b)^{a+b}}{a^a b^b} \le \frac{a+b+1}{b+1} \cdot \frac{(a+b+1)^{a+b}}{a^a (b+1)^b} = \frac{(a+b+1)^{a+b+1}}{a^a (b+1)^{b+1}}.$$ □

Lemma 7. If $a, b$ are nonnegative integers, then $\binom{a+b}{a} \le 2^{2\sqrt{ab}}$.
Proof. Firstly, observe that the claim is trivial for $a = 0$ or $b = 0$; hence, we can assume that $a, b > 0$. Moreover, without losing generality assume that $a \le b$. Let us denote $\sqrt{ab} = \ell$ and $\sqrt{a/b} = t$; then $0 < t \le 1$, $a = \ell t$ and $b = \ell / t$. By Lemma 6 we have that

$$\binom{a+b}{a} \le \frac{(a+b)^{a+b}}{a^a b^b} = \left(1 + \frac{b}{a}\right)^a \left(1 + \frac{a}{b}\right)^b = \left[\left(1 + \frac{1}{t^2}\right)^{t} \left(1 + t^2\right)^{1/t}\right]^{\ell}.$$

Let us denote $g(x) = x \ln(1 + x^{-2}) + x^{-1} \ln(1 + x^2)$. As $\binom{a+b}{a} \le e^{\ell \cdot g(t)}$, it suffices to prove that $g(x) \le 2 \ln 2$ for all $0 < x \le 1$. Observe that

$$g'(x) = \ln(1 + x^{-2}) - x \cdot 2x^{-3} \cdot \frac{1}{1 + x^{-2}} - x^{-2} \ln(1 + x^2) + x^{-1} \cdot 2x \cdot \frac{1}{1 + x^2}$$
$$= \ln(1 + x^{-2}) - \frac{2}{1 + x^2} - x^{-2} \ln(1 + x^2) + \frac{2}{1 + x^2} = \ln(1 + x^{-2}) - x^{-2} \ln(1 + x^2).$$

Let us now introduce $h : (0, 1] \to \mathbb{R}$, $h(y) = g'(\sqrt{y}) = \ln(1 + y^{-1}) - y^{-1} \ln(1 + y)$. Then,

$$h'(y) = -y^{-2} \cdot \frac{1}{1 + y^{-1}} + y^{-2} \ln(1 + y) - y^{-1} \cdot \frac{1}{1 + y} = y^{-2} \ln(1 + y) - \frac{2}{y + y^2}.$$

We claim that $h'(y) \le 0$ for all $y \le 1$. Indeed, from the inequality $\ln(1 + y) \le y$ we infer that

$$y^{-2} \ln(1 + y) \le y^{-1} = \frac{2}{y + y} \le \frac{2}{y + y^2}.$$

Therefore, $h'(y) \le 0$ for $y \in (0, 1]$, so $h(y)$ is non-increasing on this interval. As $h(1) = 0$, this implies that $h(y) \ge 0$ for $y \in (0, 1]$, so also $g'(x) \ge 0$ for $x \in (0, 1]$. This means that $g(x)$ is non-decreasing on the interval $(0, 1]$, so $g(x) \le g(1) = 2 \ln 2$. □
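
As a sanity check, the two binomial bounds can be verified numerically for small arguments. The sketch below is only an illustration with hypothetical function names; Lemma 6 is tested in exact integer arithmetic, while Lemma 7 is tested on logarithms with a small floating-point tolerance (an assumption of this sketch, needed because the bound is tight when $a = 0$ or $b = 0$).

```python
from math import comb, log2, sqrt

def lemma6_holds(a, b):
    """Check binom(a+b, a) <= (a+b)^(a+b) / (a^a * b^b) for positive integers a, b."""
    return comb(a + b, a) * a**a * b**b <= (a + b) ** (a + b)

def lemma7_holds(a, b, tol=1e-9):
    """Check binom(a+b, a) <= 2^(2*sqrt(a*b)) for nonnegative integers a, b."""
    return log2(comb(a + b, a)) <= 2 * sqrt(a * b) + tol

# Exhaustive check on a small range of arguments.
assert all(lemma6_holds(a, b) for a in range(1, 31) for b in range(1, 31))
assert all(lemma7_holds(a, b) for a in range(0, 31) for b in range(0, 31))
```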



3.3 Small cuts 

We can now proceed to the algorithm itself. Let us introduce the key notion.

Definition 8. Let $G = (V, E)$ be an undirected graph. A partition $(V_1, V_2)$ of $V$ is called a $k$-nice partition of $G$ if $|E(V_1, V_2)| \le k$.

Firstly, we observe that $k$-nice partitions can be quickly enumerated.

Lemma 9. $k$-nice partitions of a graph $G$ can be enumerated with polynomial delay.

Proof. We follow the standard branching. We order the vertices arbitrarily, start with empty $V_1$, $V_2$ and for each consecutive vertex $v$ we branch into two subcases: we put $v$ either into $V_1$ or into $V_2$. Once the alignment of all vertices is decided, we output the partition. However, each time we put a vertex in one of the sides, we run a polynomial-time max-flow algorithm to check whether the minimum edge cut between $V_1$ and $V_2$ constructed so far is at most $k$. If not, then we terminate this branch as it certainly results in no solutions found. Thus, we always pursue a branch that results in at least one feasible solution, and finding the next solution occurs within a polynomial number of steps. □
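
The branching from the proof can be sketched as a generator. Note the simplification: instead of the max-flow test used in the proof, this sketch prunes a branch only when the edges already cut among the assigned vertices exceed $k$. This is still correct, but it does not guarantee polynomial delay. The function name `nice_partitions` is hypothetical, and partitions are treated as ordered pairs.

```python
def nice_partitions(vertices, edges, k):
    """Yield all ordered partitions (V1, V2) with at most k edges between the sides."""
    order = list(vertices)
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    side = {}  # current partial assignment: vertex -> 1 or 2

    def branch(i, cut):
        if cut > k:
            return  # the partial assignment already cuts too many edges
        if i == len(order):
            V1 = frozenset(v for v in order if side[v] == 1)
            yield V1, frozenset(order) - V1
            return
        v = order[i]
        for s in (1, 2):
            side[v] = s
            # Count edges from v to already assigned vertices on the other side.
            new = sum(1 for u in adj[v] if side.get(u, s) != s)
            yield from branch(i + 1, cut + new)
            del side[v]

    yield from branch(0, 0)
```

On a triangle, only the two trivial partitions are 0-nice, while all eight ordered partitions are 2-nice.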

Intuitively, $k$-nice partitions of the graph $G$ form the search space of the algorithm. Therefore, we would like to bound their number. This we do by firstly bounding the number of nice partitions of a cluster graph, and then using the fact that a YES-instance is not very far from some cluster graph.

Lemma 10. Let $K$ be a cluster graph containing at most $p$ clusters, where $p \le 6k$. Then the number of $k$-nice partitions of $K$ is at most $2^{8\sqrt{pk}}$.

Proof. By somewhat abusing the notation, assume that $K$ has exactly $p$ clusters, some of which may be empty. Let $C_1, C_2, \ldots, C_p$ be these clusters and $c_1, c_2, \ldots, c_p$ be their sizes, respectively. We firstly establish a bound on the number of partitions $(V_1, V_2)$ such that the cluster $C_i$ contains $x_i$ vertices from $V_1$ and $y_i$ from $V_2$. Then we discuss that the number of ways of selecting pairs $x_i, y_i$ summing up to $c_i$, for which the number of $k$-nice partitions is positive, can be bounded in a similar manner. Multiplying the obtained two bounds gives us the claim.

Having fixed the numbers $x_i, y_i$, the number of ways in which the cluster $C_i$ can be partitioned is equal to $\binom{x_i + y_i}{x_i}$. Note that $\binom{x_i + y_i}{x_i} \le 2^{2\sqrt{x_i y_i}}$ by Lemma 7. Observe that there are $x_i y_i$ edges between $V_1$ and $V_2$ inside the cluster $C_i$, so if $(V_1, V_2)$ is a $k$-nice partition, then $\sum_{i=1}^{p} x_i y_i \le k$. By applying the Cauchy-Schwarz inequality we infer that $\sum_{i=1}^{p} \sqrt{x_i y_i} \le \sqrt{p} \cdot \sqrt{\sum_{i=1}^{p} x_i y_i} \le \sqrt{pk}$. Therefore, the number of considered nice partitions is bounded by

$$\prod_{i=1}^{p} \binom{x_i + y_i}{x_i} \le 2^{2 \sum_{i=1}^{p} \sqrt{x_i y_i}} \le 2^{2\sqrt{pk}}.$$

Moreover, observe that $\min(x_i, y_i) \le \sqrt{x_i y_i}$; hence, $\sum_{i=1}^{p} \min(x_i, y_i) \le \sqrt{pk}$. Therefore, the choice of $x_i, y_i$ can be modeled by first choosing for each $i$ whether $\min(x_i, y_i)$ is equal to $x_i$ or to $y_i$, and then expressing $\lfloor \sqrt{pk} \rfloor$ as the sum of $p + 1$ nonnegative numbers: $\min(x_i, y_i)$ for $1 \le i \le p$ and the rest, $\lfloor \sqrt{pk} \rfloor - \sum_{i=1}^{p} \min(x_i, y_i)$. The number of choices in the first step is equal to $2^p \le 2^{\sqrt{6pk}}$, and in the second is equal to $\binom{\lfloor \sqrt{pk} \rfloor + p}{p} \le 2^{\sqrt{pk} + \sqrt{6pk}}$. Therefore, the number of possible choices of $x_i, y_i$ is bounded by $2^{(1 + 2\sqrt{6})\sqrt{pk}} \le 2^{6\sqrt{pk}}$. Hence, the total number of $k$-nice partitions is bounded by $2^{2\sqrt{pk}} \cdot 2^{6\sqrt{pk}} = 2^{8\sqrt{pk}}$, as claimed. □

Lemma 11. If $(G, p, k)$ is a YES-instance of p-Cluster Editing with $p \le 6k$, then the number of $k$-nice partitions of $G$ is bounded by $2^{8\sqrt{2pk}}$.

Proof. Let $K$ be a cluster graph with at most $p$ clusters such that $\mathcal{H}(G, K) \le k$. Observe that every $k$-nice partition of $G$ is also a $2k$-nice partition of $K$, as $K$ differs from $G$ by at most $k$ edge modifications. The claim follows from Lemma 10. □

3.4 The algorithm 

We are now ready to prove Theorem 1.

Proof of Theorem 1. Let $(G = (V, E), p, k)$ be the given p-Cluster Editing instance, and let $\bar{G} = (V, \bar{E})$ be the complement of the graph $G$. By making use of Proposition 4 we assume that $G$ has at most $(p+2)k+p$ vertices, thus all the factors polynomial in the size of $G$ can be henceforth hidden within the $2^{O(\sqrt{pk})}$ factor. Application of Proposition 4 gives the additional $O(n + m)$ summand to the complexity. By further using the reduction rules introduced in Section 3.1 we can also assume that $p \le 6k$.

We now enumerate the $k$-nice partitions of $G$ with polynomial delay. If we exceed the bound $2^{8\sqrt{2pk}}$ given by Lemma 11 we know that we can safely answer NO, so we immediately terminate the computation and give a negative answer. Therefore, we can assume that we have computed the set $\mathcal{N}$ of all $k$-nice partitions of $G$ and $|\mathcal{N}| \le 2^{8\sqrt{2pk}}$.

Assume that $(G, p, k)$ is a YES-instance and let $K$ be a cluster graph with at most $p$ clusters such that $\mathcal{H}(G, K) \le k$. Again, let $C_1, C_2, \ldots, C_p$ be the clusters of $K$, where, by somewhat abusing the notation, some of these clusters can be empty. Observe that for every $j \in \{0, 1, 2, \ldots, p\}$, the partition $\left(\bigcup_{i=1}^{j} V(C_i), \bigcup_{i=j+1}^{p} V(C_i)\right)$ has to be $k$-nice, as otherwise there would be more than $k$ edges that need to be deleted from $G$ in order to obtain $K$. This observation enables us to use a dynamic programming approach on the set of nice partitions.

We construct a directed graph $D$, whose vertex set is equal to $\mathcal{N} \times \{0, 1, 2, \ldots, p\} \times \{0, 1, 2, \ldots, k\}$; note that $|V(D)| = 2^{O(\sqrt{pk})}$. We create arcs going from $((V_1, V_2), j, \ell)$ to $((V_1', V_2'), j + 1, \ell')$, where $V_1 \subseteq V_1'$ (hence $V_2 \supseteq V_2'$), $j \in \{0, 1, 2, \ldots, p - 1\}$ and $\ell' = \ell + |E(V_1, V_1' \setminus V_1)| + |\bar{E}(V_1' \setminus V_1, V_1' \setminus V_1)|$. The arcs can be constructed in $2^{O(\sqrt{pk})}$ time by checking for all the pairs of vertices whether they should be connected. We claim that the answer to the instance $(G, p, k)$ is equivalent to reachability of any of the vertices of form $((V, \emptyset), p, \ell)$ from the vertex $((\emptyset, V), 0, 0)$.

In one direction, if there is a path from $((\emptyset, V), 0, 0)$ to $((V, \emptyset), p, \ell)$ for some $\ell \le k$, then the consecutive sets $V_1' \setminus V_1$ along the path form clusters $C_i$ of a cluster graph whose editing distance to $G$ is accumulated on the last coordinate, thus bounded by $k$. In the second direction, if there is a cluster graph $K$ with clusters $C_1, C_2, \ldots, C_p$ within editing distance at most $k$ from $G$, then the vertices

$$\left(\left(\bigcup_{i=1}^{j} V(C_i), \bigcup_{i=j+1}^{p} V(C_i)\right), j, \mathcal{H}\left(G\left[\bigcup_{i=1}^{j} V(C_i)\right], K\left[\bigcup_{i=1}^{j} V(C_i)\right]\right)\right)$$

form a path from $((\emptyset, V), 0, 0)$ to $((V, \emptyset), p, \mathcal{H}(G, K))$. Note that all these triples are indeed vertices of the graph $D$, as $\left(\bigcup_{i=1}^{j} V(C_i), \bigcup_{i=j+1}^{p} V(C_i)\right)$ are $k$-nice partitions of $G$.

Reachability in a directed graph can be tested in linear time with respect to the number of vertices and arcs, using, for example, breadth-first search. We can now apply this algorithm to the graph $D$ and conclude solving the p-Cluster Editing instance in $O(2^{O(\sqrt{pk})} + n + m)$ time. □
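
The dynamic programming over nice partitions can be sketched for very small graphs. This is a brute-force illustration under several assumptions, not the algorithm with the stated running time: vertices are numbered $0, \ldots, n-1$, partitions are bitmasks of $V_1$, all $2^n$ masks are scanned to find the $k$-nice ones, kernelization is skipped, and instead of an explicit third coordinate $\ell$ the sketch stores the minimum accumulated cost per state. As in the proof, empty clusters are allowed, so the function (hypothetically named `solve_by_nice_partitions`) decides whether some cluster graph with at most $p$ clusters is within $k$ edits.

```python
from itertools import combinations

def solve_by_nice_partitions(n, edges, p, k):
    """Decide whether the graph on 0..n-1 is within k edits of a cluster graph with <= p clusters."""
    E = {frozenset(e) for e in edges}

    def cut(mask):
        # Number of edges with exactly one endpoint in the set encoded by mask.
        return sum(1 for e in E if len({(mask >> v) & 1 for v in e}) == 2)

    nice = [m for m in range(1 << n) if cut(m) <= k]

    def cost(old, new):
        # The new cluster consists of the vertices in new but not in old.
        block = [v for v in range(n) if (new >> v) & 1 and not (old >> v) & 1]
        to_delete = sum(1 for e in E
                        if sum(1 for v in e if v in block) == 1
                        and any((old >> v) & 1 for v in e))
        to_add = sum(1 for u, v in combinations(block, 2)
                     if frozenset((u, v)) not in E)
        return to_delete + to_add

    best = {0: 0}  # level j = 0: only the partition (empty set, V), at cost 0
    for _ in range(p):
        nxt = {}
        for old, c in best.items():
            for new in nice:
                if new & old == old:  # V1 is a subset of V1'
                    c2 = c + cost(old, new)
                    if c2 <= k and c2 < nxt.get(new, float("inf")):
                        nxt[new] = c2
        best = nxt
    return (1 << n) - 1 in best
```

For the path on three vertices, one edit suffices both for a single cluster (add the missing edge) and for two clusters (delete an edge), while zero edits do not.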



4 Multivariate lower bound 



This section is devoted to the proof of Theorem 2. The proof consists of four parts. In Section 4.1 we preprocess the input formula $\Phi$ to make it more regular. Section 4.2 contains the details of the construction of the graph $G$. In Section 4.3 we show how to translate a satisfying assignment of $\Phi$ into a $6p$-cluster graph $G_0$ close to $G$ and we provide a reverse implication in Section 4.4. In the proof we treat $\varepsilon$ as a constant and hide the factors depending on it in the $O$-notation. That is, the constants in the $O$-notation correspond to the factor $\delta$ in the statement of Theorem 2.
4.1 Preprocessing of the formula

We start with a step that regularizes the input formula while increasing its size only by a constant factor. The purpose of this step is to ensure that, when we translate a satisfying assignment of $\Phi$ into a cluster graph $G_0$ in the completeness step, the clusters are of the same size, and therefore contain the minimum possible number of edges. This property is used in the argumentation of the soundness step.

Lemma 12. For any fixed $\varepsilon > 0$, there exists a polynomial-time algorithm that, given a 3-CNF formula $\Phi$ with $n$ variables and $m$ clauses and an integer $p$, $\varepsilon p \le n$, constructs a 3-CNF formula $\Phi'$ with $n'$ variables and $m'$ clauses together with a partition of the variable set $\mathrm{Vars}(\Phi')$ into $p$ parts $\mathrm{Vars}^r$, $1 \le r \le p$, such that the following properties hold:

(a) $\Phi'$ is satisfiable iff $\Phi$ is;

(b) in $\Phi'$ every clause contains exactly three literals corresponding to different variables;

(c) in $\Phi'$ every variable appears exactly three times positively and exactly three times negatively;

(d) $n'$ is divisible by $p$ and, for each $1 \le r \le p$, $|\mathrm{Vars}^r| = n'/p$ (i.e., the variables are split evenly between the parts $\mathrm{Vars}^r$);

(e) if $\Phi'$ is satisfiable, then there exists a satisfying assignment of $\mathrm{Vars}(\Phi')$ with the property that in each part $\mathrm{Vars}^r$ the same number of variables is set to true as to false;

(f) $n' + m' = O(n + m)$, where the constant hidden in the $O$-notation depends on $\varepsilon$.

Proof. We modify $ while preserving satisfiability, consecutively ensuring that properties (b), (c), 
(d), and (e) are satisfied. Satisfaction of (f) will follow directly from the constructions used. 

First, delete every clause that contains two different literals corresponding to the same variable, 
as they are always satisfied. Remove copies of the same literals inside clauses. Until all the clauses 
have at least two literals, remove every clause containing one literal, set the value of this literal so 
that the clause is satisfied and propagate this knowledge to the other clauses. At the end, create a 
new variable p and for every clause C that has two literals replace it with two clauses CM p and 
C V -ip. All these operations preserve satisfiability and at the end all the clauses consist of exactly 
three different literals. 

Second, duplicate each clause so that every variable appears an even number of times. Intro- 
duce two new variables g,r. Take any variable x, assume that x appears positively times and 
negatively k" times. If < k~ , introduce clauses (x V g V r) and (x V -iq' V -r), each ^ 
times, otherwise introduce clauses (-ix V g V r) and (-ix V -ig V -ir), each ^"""^^ times. These oper- 
ations preserve satisfiability (as the new clauses can be satisfied by setting q to true and r to false) 
and, after the operation, every variable appears the same number of time positively as negatively 
(including the new variables q, r) . 

Third, copy each clause three times. For each variable x, replace all occurences of the variable 
X with a cycle of implications in the following way. Assume that x appears 6d times (the number of 
appearances is divisible by six due to the modifications in the previous paragraph and the copying 
step). Introduce new variables Xj for 1 < z < 3d, yi for 1 < i < d and clauses (-iXj V Xj+i V ypi/s]) 
and (-iXj V Xj+i V ^yii/s]) for 1 < i < 3d (with xs^+i = xi). Moreover, replace each occurence of 
the variable x with one of the variables Xj in such a way that each variable Xi is used once in a 
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positive literal and once in a negative one. In this manner each variable Xi and yi is used exactly 
three times in a positive literal and three times in a negative one. Moreover, the new clauses form 
an implication cycle xi =^ 2:2 =^ • • • =^ a^3d ^ xi, ensuring that all the variables Xi have equal value 
in any satisfying assignment of the formula. 

Fourth, to make n' divisible by p we first copy the entire formula three times, creating a new 
set of variables for each copy. In this way we ensure that the number of variables is divisible by 
three. Then we add new variables in triples to make the number of variables divisible by p. For 
each triple x,y,z of new variables, we introduce six new clauses: all possible clauses that contain 
one literal for each variable x, y and z except for {x y y V z) and (-ix V -ly V ^z). Note that the 
new clauses are easily satisfied by setting all new variables to true, while all new variables appear 
exactly three times positively and three times negatively. Moreover, as initially ep < n, this step 
increases the size of the formula only by a constant factor. 

Finally, to achieve (d) and (e) take $\Phi' = \Phi \wedge \overline{\Phi}$, where $\overline{\Phi}$ is a copy of $\Phi$ on a disjoint copy of the
variable set and with all literals reversed, i.e., positive occurrences are replaced by negative ones
and vice versa. Of course, if $\Phi'$ is satisfiable then $\Phi$ is as well, while if $\Phi$ is satisfiable, then we can
copy the assignment to the copies of variables and reverse it, thus obtaining a feasible assignment
for $\Phi'$. Recall that before this step the number of variables was divisible by $p$. We can now partition
the variable set into $p$ parts, such that whenever we include a variable into one part, we include
its copy in the same part as well. In order to prove that the property (e) holds, take any feasible
solution to $\Phi'$, truncate the evaluation to $\mathrm{Vars}(\Phi)$ and copy it while reversing on $\mathrm{Vars}(\overline{\Phi})$. □
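
The mirroring step can be illustrated with a short Python sketch. The tags `"orig"` and `"copy"`, the function names, and the dictionary-based assignment are hypothetical choices for this example only.

```python
def mirror_formula(clauses):
    """Return the formula Phi AND mirrored(Phi); literals are (variable, is_positive) pairs."""
    original = [
        tuple(((var, "orig"), positive) for var, positive in clause)
        for clause in clauses
    ]
    mirrored = [
        tuple(((var, "copy"), not positive) for var, positive in clause)
        for clause in clauses
    ]
    return original + mirrored


def balanced_assignment(assignment):
    """Copy an assignment of Phi to the mirrored variables with reversed values."""
    out = {}
    for var, value in assignment.items():
        out[(var, "orig")] = value
        out[(var, "copy")] = not value
    return out


def satisfies(clauses, assignment):
    """Check that every clause contains a literal made true by the assignment."""
    return all(
        any(assignment[var] == positive for var, positive in clause)
        for clause in clauses
    )
```

Because every variable and its copy receive opposite values, the extended assignment sets exactly half of the variables to true, which is what the balanced-parts property needs when a variable and its copy are placed in the same part.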



4.2 Construction 

In this section we show how to compute the graph $G$ and the integer $k'$ from the formula $\Phi'$ given
by Lemma 12. As Lemma 12 increases the size of the formula by a constant factor, we have that
$n', m' = O(\sqrt{pk})$ and $|\mathrm{Vars}^r| = n'/p = O(\sqrt{k/p})$ for $1 \le r \le p$.

Observe that in the statement of Theorem 2 we can safely assume that $\varepsilon < 1$, as the
assumptions become more and more restricted as $\varepsilon$ becomes smaller. From now on we assume that
$\varepsilon < 1$.



Let $L = 1000 \cdot \left(1 + \frac{n'}{p}\right) = O(\sqrt{k/p})$. For each part $\mathrm{Vars}^r$, $1 \le r \le p$, we create six cliques $Q^r_\alpha$,
$1 \le \alpha \le 6$, each of size $L$. Let $\mathcal{Q}$ be the set of all vertices of all cliques $Q^r_\alpha$. In this manner we
have $6p$ cliques. Intuitively, if we seek a $6p$-cluster graph close to $G$, then the cliques are large
enough so that merging two cliques is expensive; in the intended solution we have exactly one
clique in each cluster.

For every variable $x \in \mathrm{Vars}^r$ we create six vertices $w^x_{1,2}, w^x_{2,3}, \ldots, w^x_{5,6}, w^x_{6,1}$. Connect them into
a cycle in this order; this cycle is called a 6-cycle for the variable $x$. Moreover, for each $1 \le \alpha \le 6$
and $v \in V(Q^r_\alpha)$, create edges $v w^x_{\alpha-1,\alpha}$ and $v w^x_{\alpha,\alpha+1}$ (we assume that the indices behave cyclically, i.e.,
$w^x_{6,7} = w^x_{6,1}$, $Q^r_7 = Q^r_1$ etc.). Let $\mathcal{W}$ be the set of all vertices $w^x_{\alpha,\alpha+1}$ for all variables $x$. Intuitively,
the cheapest way to cut the 6-cycle for variable $x$ is to assign the vertices $w^x_{\alpha,\alpha+1}$, $1 \le \alpha \le 6$, all
either to the clusters with cliques with only odd indices or only with even indices. Choosing even
indices corresponds to setting $x$ to false, while choosing odd ones corresponds to setting $x$ to true.

Let $r(x)$ be the index of the part that contains the variable $x$, that is, $x \in \mathrm{Vars}^{r(x)}$.
In each clause $C$ we (arbitrarily) enumerate variables: for $1 \le \eta \le 3$, let $\mathrm{var}(C, \eta)$ be the
variable in the $\eta$-th literal of $C$, and $\mathrm{sgn}(C, \eta) = 0$ if the $\eta$-th literal is negative and $\mathrm{sgn}(C, \eta) = 1$
otherwise.
For every clause $C$ create nine vertices: $s^C_{\beta,\xi}$ for $1 \le \beta, \xi \le 3$. The edges incident to the vertex
$s^C_{\beta,\xi}$ are defined as follows:

• for each $1 \le \eta \le 3$ create an edge $s^C_{\beta,\xi} w^{\mathrm{var}(C,\eta)}_{2\beta+2\eta-3,\,2\beta+2\eta-2}$;

• if $\xi = 1$, for each $1 \le \eta \le 3$ connect $s^C_{\beta,\xi}$ to all vertices of one of the cliques the vertex
$w^{\mathrm{var}(C,\eta)}_{2\beta+2\eta-3,\,2\beta+2\eta-2}$ is adjacent to depending on the sign of the $\eta$-th literal in $C$, that is, the
clique $Q^{r(\mathrm{var}(C,\eta))}_{2\beta+2\eta-2-\mathrm{sgn}(C,\eta)}$;

• if $\xi > 1$, for each $1 \le \eta \le 3$ connect $s^C_{\beta,\xi}$ to all vertices of both cliques the vertex $w^{\mathrm{var}(C,\eta)}_{2\beta+2\eta-3,\,2\beta+2\eta-2}$
is adjacent to, that is, the cliques $Q^{r(\mathrm{var}(C,\eta))}_{2\beta+2\eta-3}$ and $Q^{r(\mathrm{var}(C,\eta))}_{2\beta+2\eta-2}$.

We note that for a fixed vertex $s^C_{\beta,\xi}$, the aforementioned cliques $s^C_{\beta,\xi}$ is adjacent to are pairwise
different, and they have pairwise different subscripts (but may have equal superscripts, i.e., belong
to the same part). See Figure 1 for an illustration.
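
The claim about pairwise different subscripts follows from the index arithmetic alone. The sketch below computes the cliques attached to one clause vertex from the rules above; the function names and the encoding of a literal as a `(part, sign)` pair are assumptions introduced for this example.

```python
def cyc(i):
    """Map an integer index to the cyclic range 1..6."""
    return (i - 1) % 6 + 1


def attached_cliques(beta, xi, literals):
    """Cliques joined to the clause vertex s_{beta,xi}.

    literals: three (part, sgn) pairs, one per literal of the clause, where
    part is r(var(C, eta)) and sgn is 1 for a positive and 0 for a negative literal.
    Returns a list of (part, subscript) pairs identifying cliques.
    """
    cliques = []
    for eta, (part, sgn) in enumerate(literals, start=1):
        hi = 2 * beta + 2 * eta - 2          # second subscript of the neighbouring w-vertex
        if xi == 1:
            cliques.append((part, cyc(hi - sgn)))   # one clique, chosen by the sign
        else:
            cliques.append((part, cyc(hi - 1)))     # both cliques of the w-vertex
            cliques.append((part, cyc(hi)))
    return cliques
```

For a fixed $\beta$ the three literals use the index pairs $(2\beta-1, 2\beta)$, $(2\beta+1, 2\beta+2)$ and $(2\beta+3, 2\beta+4)$, which are six consecutive integers and therefore distinct modulo 6.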

Let $\mathcal{S}$ be the set of all vertices $s^C_{\beta,\xi}$ for all clauses $C$. If we seek a $6p$-cluster graph close to
the graph $G$, it is reasonable to put a vertex $s^C_{\beta,\xi}$ in a cluster together with one of the cliques this
vertex is attached to. If $s^C_{\beta,\xi}$ is put in a cluster together with one of the vertices $w^{\mathrm{var}(C,\eta)}_{2\beta+2\eta-3,\,2\beta+2\eta-2}$
for $1 \le \eta \le 3$, we do not need to cut the appropriate edge. The vertices $s^C_{\beta,1}$ verify the assignment
encoded by the variable vertices $w^x_{\alpha,\alpha+1}$; the vertices $s^C_{\beta,2}$ and $s^C_{\beta,3}$ help us to make all clusters be
of equal size (which is helpful in the soundness argument).

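We note that $|V(G)| = 6pL + O(n' + m') = O(\sqrt{pk})$.

We now define the budget $k'$ for edge editings. To make the presentation clearer, we split
this budget into a few summands. Let

$$k_{\mathcal{Q}-\mathcal{Q}} = 0, \qquad k_{\mathcal{Q}-\mathcal{WS}} = (6n' + 36m')L, \qquad k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} = 6p \binom{\frac{6n'+9m'}{6p}}{2},$$

$$k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} = 6n' + 27m', \qquad k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} = 3n', \qquad k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} = 9m',$$

and finally

$$k' = k_{\mathcal{Q}-\mathcal{Q}} + k_{\mathcal{Q}-\mathcal{WS}} + k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} + k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} - 2k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} - 2k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}.$$

Note that, as $p < k$, $L = O(\sqrt{k/p})$ and $n', m' = O(\sqrt{pk})$, we have $k' = O(k)$.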
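
As a worked example, the budget can be evaluated from its summands. This is a sketch under the assumption that the summands are the ones recalled in the completeness and soundness proofs ($6n' + 27m'$ existing edges, $3n'$ and $9m'$ saved edges) and that the cluster size $(6n'+9m')/(6p)$ is an integer; the function and parameter names are illustrative.

```python
from math import comb


def budget(n_vars, n_clauses, p, L):
    """Evaluate k' from its summands for n' = n_vars, m' = n_clauses."""
    total = 6 * n_vars + 9 * n_clauses            # number of vertices in W and S
    if total % (6 * p) != 0:
        raise ValueError("assumed: (6n' + 9m') is divisible by 6p")
    size = total // (6 * p)                       # W/S-vertices per cluster

    k_q_q = 0                                     # no editing inside the cliques
    k_q_ws = (6 * n_vars + 36 * n_clauses) * L    # clique edges that are cut
    k_all = 6 * p * comb(size, 2)                 # edges of equal-sized clusters
    k_exist = 6 * n_vars + 27 * n_clauses         # edges of G[W u S]
    k_save_ww = 3 * n_vars                        # three edges kept per 6-cycle
    k_save_ws = 9 * n_clauses                     # one edge kept per clause vertex
    return k_q_q + k_q_ws + k_all + k_exist - 2 * k_save_ww - 2 * k_save_ws
```

For instance, with $n' = 4$, $m' = 8$, $p = 2$ and $L = 1000$ each cluster receives $96/12 = 8$ vertices of $\mathcal{W} \cup \mathcal{S}$, and the dominant summand is the $312000$ clique edges that are cut.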

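The intuition behind this split is as follows. The intended solution for the $p$-Cluster Editing
instance $(G, 6p, k')$ creates no edges between the cliques $Q^r_\alpha$, each clique is contained in its own
cluster, and $k_{\mathcal{Q}-\mathcal{Q}} = 0$. For each $v \in \mathcal{W} \cup \mathcal{S}$, the vertex $v$ is assigned to a cluster with one clique $v$
is adjacent to; $k_{\mathcal{Q}-\mathcal{WS}}$ accumulates the cost of removal of other edges in $E(\mathcal{Q}, \mathcal{W} \cup \mathcal{S})$. Finally, we
count the editings in $(\mathcal{W} \cup \mathcal{S}) \times (\mathcal{W} \cup \mathcal{S})$ in an indirect way. First we cut all edges of $E(\mathcal{W} \cup \mathcal{S}, \mathcal{W} \cup \mathcal{S})$
(summand $k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}}$). Then we group the vertices of $\mathcal{W} \cup \mathcal{S}$ into clusters and add edges between vertices
in each cluster; the summand $k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}}$ corresponds to the cost of this operation when all the
clusters are of the same size (and the number of edges is minimum possible). Finally, in summands
$k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ and $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$ we count how many edges are removed and then added again in this process:
$k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ corresponds to saving three edges from each 6-cycle in $E(\mathcal{W}, \mathcal{W})$ and $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$ corresponds to
saving one edge in $E(\mathcal{W}, \mathcal{S})$ per each vertex $s^C_{\beta,\xi}$.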

Figure 1: A part of the graph $G$ created for the clause $C = (\neg x \vee \neg y \vee z)$, with $\mathrm{var}(C, 1) = x$,
$\mathrm{var}(C, 2) = y$ and $\mathrm{var}(C, 3) = z$. Note that the parts $r(x)$, $r(y)$ and $r(z)$ may not be pairwise
distinct. However, due to the rotation index $\beta$, in any case for a fixed vertex $s^C_{\beta,\xi}$ the cliques this
vertex is adjacent to in this figure are pairwise distinct and have pairwise distinct subscripts.

4.3 Completeness

We now show how to translate a satisfying assignment of the input formula $\Phi$ into a $6p$-cluster
graph close to $G$.

Lemma 13. If the input formula $\Phi$ is satisfiable, then there exists a $6p$-cluster graph $G_0$ on vertex
set $V(G)$ such that $\mathcal{H}(G, G_0) = k'$.



Proof. Let $\phi'$ be a satisfying assignment of the formula $\Phi'$ as guaranteed by Lemma 12. Recall that
in each part $\mathrm{Vars}^r$, the assignment $\phi'$ sets the same number of variables to true as to false.

To simplify the presentation, we identify the range of $\phi'$ with integers: $\phi'(x) = 0$ if $x$ is evaluated
to false in $\phi'$ and $\phi'(x) = 1$ otherwise. Moreover, for a clause $C$ by $\eta(C)$ we denote the index of an
arbitrarily chosen literal that satisfies $C$ in the assignment $\phi'$.

We create $6p$ clusters $K^r_\alpha$, $1 \le r \le p$, $1 \le \alpha \le 6$, as follows:

• $Q^r_\alpha \subseteq K^r_\alpha$ for $1 \le r \le p$, $1 \le \alpha \le 6$;

• for $x \in \mathrm{Vars}(\Phi')$, if $\phi'(x) = 1$ then $w^x_{6,1}, w^x_{1,2} \in K^{r(x)}_1$, $w^x_{2,3}, w^x_{3,4} \in K^{r(x)}_3$, $w^x_{4,5}, w^x_{5,6} \in K^{r(x)}_5$;

• for $x \in \mathrm{Vars}(\Phi')$, if $\phi'(x) = 0$ then $w^x_{1,2}, w^x_{2,3} \in K^{r(x)}_2$, $w^x_{3,4}, w^x_{4,5} \in K^{r(x)}_4$, $w^x_{5,6}, w^x_{6,1} \in K^{r(x)}_6$;

• for each clause $C$ of $\Phi'$ and $1 \le \beta, \xi \le 3$ we define $\eta = \eta(C) + \xi - 1$ and we assign the vertex
$s^C_{\beta,\xi}$ to the cluster $K^{r(\mathrm{var}(C,\eta))}_{2\beta+2\eta-2-\phi'(\mathrm{var}(C,\eta))}$.

Note that in this way $s^C_{\beta,\xi}$ belongs to the same cluster as its neighbor $w^{\mathrm{var}(C,\eta)}_{2\beta+2\eta-3,\,2\beta+2\eta-2}$. See Figure
2 for an illustration.
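
The placement of the cycle vertices can be written as a one-line rule: $w^x_{\alpha,\alpha+1}$ goes to whichever of its two cliques $Q_\alpha$, $Q_{\alpha+1}$ has the parity encoding the truth value. The sketch below (function names are assumptions) applies this rule and counts how many edges of the 6-cycle stay inside a cluster.

```python
def cyc(i):
    """Map an integer index to the cyclic range 1..6."""
    return (i - 1) % 6 + 1


def w_cluster(alpha, value):
    """Subscript of the cluster receiving w_{alpha,alpha+1} for truth value 0 or 1."""
    wanted_parity = value % 2        # odd subscripts encode true, even ones false
    return alpha if alpha % 2 == wanted_parity else cyc(alpha + 1)


def saved_cycle_edges(value):
    """Number of 6-cycle edges whose two endpoints land in the same cluster."""
    clusters = [w_cluster(alpha, value) for alpha in range(1, 7)]
    return sum(clusters[i] == clusters[(i + 1) % 6] for i in range(6))
```

Both truth values keep exactly three of the six cycle edges, which is the saving of three edges per 6-cycle used in the budget.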

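Let us now compute $\mathcal{H}(G, G_0)$. We do not need to add nor delete any edges in $G[\mathcal{Q}]$. We note
that each vertex $v \in \mathcal{W} \cup \mathcal{S}$ is assigned to a cluster with one clique it is adjacent to. Indeed, this
is only non-trivial for vertices $s^C_{\beta,1}$ for clauses $C$ and $1 \le \beta \le 3$. Note that this vertex belongs to the
cluster $K^{r(\mathrm{var}(C,\eta(C)))}_{2\beta+2\eta(C)-2-\phi'(\mathrm{var}(C,\eta(C)))}$. Since the $\eta(C)$-th literal of $C$ satisfies $C$
in the assignment $\phi'$, we have $\mathrm{sgn}(C, \eta(C)) = \phi'(\mathrm{var}(C, \eta(C)))$, and hence $s^C_{\beta,1}$ is adjacent to all
vertices of the clique $Q^{r(\mathrm{var}(C,\eta(C)))}_{2\beta+2\eta(C)-2-\phi'(\mathrm{var}(C,\eta(C)))}$.

Therefore we need to cut $k_{\mathcal{Q}-\mathcal{WS}} = (6n' + 36m')L$ edges in $E(\mathcal{Q}, \mathcal{W} \cup \mathcal{S})$: $L$ edges adjacent to
each vertex $w^x_{\alpha,\alpha+1}$, $2L$ edges adjacent to each vertex $s^C_{\beta,1}$, and $5L$ edges adjacent to each vertex
$s^C_{\beta,2}$ and $s^C_{\beta,3}$. We do not add any new edges between $\mathcal{Q}$ and $\mathcal{W} \cup \mathcal{S}$.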

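To count the number of editings in $G[\mathcal{W} \cup \mathcal{S}]$, let us first verify that the clusters are of equal
sizes. Fix a cluster $K^r_\alpha$, $1 \le \alpha \le 6$, $1 \le r \le p$. $K^r_\alpha$ contains two vertices $w^x_{\alpha-1,\alpha}$ and $w^x_{\alpha,\alpha+1}$ for each
variable $x \in \mathrm{Vars}^r$ with $\phi'(x) + \alpha$ being even. Since $\phi'$ evaluates the same number of variables in $\mathrm{Vars}^r$ to
true as to false, we infer that each cluster $K^r_\alpha$ contains exactly $n'/p$ vertices from $\mathcal{W}$, corresponding
to $n'/(2p) = |\mathrm{Vars}^r|/2$ variables of $\mathrm{Vars}^r$.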

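For $1 \le \alpha \le 6$, let $\mathrm{Vars}^r_\alpha = (\phi')^{-1}(0) \cap \mathrm{Vars}^r$ if $\alpha$ is even and $(\phi')^{-1}(1) \cap \mathrm{Vars}^r$ if $\alpha$ is odd. That
is, $x \in \mathrm{Vars}^r_\alpha$ if and only if $w^x_{\alpha-1,\alpha}, w^x_{\alpha,\alpha+1} \in K^r_\alpha$. By the properties of $\Phi'$, for each $x \in \mathrm{Vars}^r_\alpha$ the
variable $x$ appears in three clauses positively and in three clauses negatively; in particular, it satisfies
exactly three clauses in the assignment $\phi'$. We claim that $K^r_\alpha \cap \mathcal{S}$ consists of $3|\mathrm{Vars}^r_\alpha| = \frac{3}{2}|\mathrm{Vars}^r|$
vertices, that is, for each variable $x \in \mathrm{Vars}^r_\alpha$, for each clause $C$ (out of three) that $x$ satisfies in the
assignment $\phi'$, $K^r_\alpha$ contains exactly one (out of nine) vertex $s^C_{\beta,\xi}$, and no more vertices of $\mathcal{S}$.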

In one direction, take a variable $x \in \mathrm{Vars}^r_\alpha$ and a clause $C$ that is satisfied by $x$ in the assignment
$\phi'$. Let $\alpha' = 2\lceil \alpha/2 \rceil$, so that $w^x_{\alpha'-1,\alpha'} \in \{w^x_{\alpha-1,\alpha}, w^x_{\alpha,\alpha+1}\}$ is the vertex with first subscript odd and
the second even. Take $\eta$ such that $x = \mathrm{var}(C, \eta)$ and $\beta = \alpha'/2 - \eta + 1$. Then $\alpha' = 2\beta + 2\eta - 2$, and
the three vertices $s^C_{\beta,\xi}$ for $1 \le \xi \le 3$ are adjacent to $w^x_{\alpha'-1,\alpha'}$. Now let $\xi = \eta - \eta(C) + 1$; then $s^C_{\beta,\xi}$ is
assigned to the same cluster as $w^x_{\alpha'-1,\alpha'}$, since $\eta = \eta(C) + \xi - 1$. Since $x \in \mathrm{Vars}^r_\alpha$, then $s^C_{\beta,\xi} \in K^r_\alpha$.

Figure 2: Parts of clusters for variables $x$, $y$ and $z$ with $\phi'(x) = 1$, $\phi'(y) = 0$, $\phi'(z) = 1$, and a
clause $C = (\neg x \vee \neg y \vee z)$ with $\mathrm{var}(C, 1) = x$, $\mathrm{var}(C, 2) = y$, $\mathrm{var}(C, 3) = z$ and $\eta(C) = 2$ (note that
both $y$ and $z$ satisfy $C$ in the assignment $\phi'$, but $y$ was chosen as a representative).

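In the other direction, let $s^C_{\beta,\xi} \in K^r_\alpha$ for some clause $C$ and $1 \le \beta, \xi \le 3$. Recall that $s^C_{\beta,\xi}$ belongs
to the same cluster as one of its three neighbors in $\mathcal{W}$. Therefore there exists $w^x_{\alpha'-1,\alpha'}$ adjacent to
$s^C_{\beta,\xi}$ that belongs to $K^r_\alpha$; note that $\alpha'$ is even. Moreover, as $s^C_{\beta,\xi}$ and $w^x_{\alpha'-1,\alpha'}$ are assigned to the
same cluster, we infer that $x$ satisfies $C$. As $w^x_{\alpha'-1,\alpha'} \in K^r_\alpha$, then $x \in \mathrm{Vars}^r_\alpha$. Let $\eta$ be such that
$x = \mathrm{var}(C, \eta)$. As $s^C_{\beta,\xi} w^x_{\alpha'-1,\alpha'} \in E(G)$, we have $\alpha'/2 = \beta + \eta - 1$, that is, $\beta = \alpha'/2 - \eta + 1$. Recall that
the neighbors of $s^C_{\beta,\xi}$ from $\mathcal{W}$ have pairwise different subscripts; that is, $s^C_{\beta,\xi}$ is adjacent to $w^x_{\alpha'-1,\alpha'}$,
$w^{x'}_{\alpha'+1,\alpha'+2}$ and $w^{x''}_{\alpha'+3,\alpha'+4}$, where $x'$ and $x''$ denote the other two variables of $C$. Therefore the cliques
that are adjacent to $w^{x'}_{\alpha'+1,\alpha'+2}$ and $w^{x''}_{\alpha'+3,\alpha'+4}$
are different from $Q^r_\alpha$, and these vertices do not belong to $K^r_\alpha$. We infer that if $s^C_{\beta,\xi} \in K^r_\alpha$, that is,
$s^C_{\beta,\xi}$ belongs to the same cluster as $w^x_{\alpha'-1,\alpha'}$, then $\eta = \eta(C) + \xi - 1$; equivalently, $\xi = \eta - \eta(C) + 1$.
Hence, $s^C_{\beta,\xi} \in K^r_\alpha$ only if $C$ is satisfied by a variable in $\mathrm{Vars}^r_\alpha$ and, providing this, for at most one
choice of the indices $1 \le \beta, \xi \le 3$. This concludes the proof of the claim.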

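We now count the number of editings in $G[\mathcal{W} \cup \mathcal{S}]$ as sketched in the construction section. The
subgraph $G[\mathcal{W} \cup \mathcal{S}]$ contains $6n' + 27m'$ edges: one 6-cycle for each variable and three edges incident
to each of the nine vertices $s^C_{\beta,\xi}$ for each clause $C$. Each cluster contains $n'/p$ vertices from $\mathcal{W}$
and, by the claim above, $3|\mathrm{Vars}^r_\alpha|$ vertices from $\mathcal{S}$. If we deleted all edges in $G[\mathcal{W} \cup \mathcal{S}]$ and then added all the missing edges
in the clusters, we would make $k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} + k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}}$ editings, due to the clusters being equal-sized.
However, in this manner we sometimes delete an edge and then introduce it again; thus, for each
edge of $G[\mathcal{W} \cup \mathcal{S}]$ that is contained in one cluster $K^r_\alpha$, we should subtract 2 in this counting scheme.

For each variable $x$, exactly three edges of the form $w^x_{\alpha-1,\alpha} w^x_{\alpha,\alpha+1}$ are contained in one cluster;
this gives a total of $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} = 3n'$ saved edges. For each clause $C$ each vertex $s^C_{\beta,\xi}$ is assigned to a
cluster with one of the vertices $w^{\mathrm{var}(C,\eta)}_{2\beta+2\eta-3,\,2\beta+2\eta-2}$, $1 \le \eta \le 3$, thus exactly one of the edges incident
to $s^C_{\beta,\xi}$ is contained in one cluster. This sums up to $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} = 9m'$ saved edges, and we infer that
the $6p$-cluster graph $G_0$ can be obtained from $G$ by exactly
$k' = k_{\mathcal{Q}-\mathcal{Q}} + k_{\mathcal{Q}-\mathcal{WS}} + k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} + k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} - 2k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} - 2k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$ editings. □
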

4.4 Soundness 

We need the following simple bound on the number of edges of a cluster graph. 

Lemma 14. Let $a, b$ be positive integers and $H$ be a cluster graph with $ab$ vertices and at most $a$
clusters. Then $|E(H)| \ge a\binom{b}{2}$ and equality holds if and only if $H$ is an $a$-cluster graph and each
cluster of $H$ has size exactly $b$.

Proof. It suffices to note that if not all clusters of $H$ are of size $b$, there is one of size at least $b+1$
and one of size at most $b-1$ or the number of clusters is less than $a$; then, moving a vertex from
the largest cluster of $H$ to a new or the smallest cluster strictly decreases the number of edges of
$H$. □
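
For small parameters the lemma can be confirmed by exhaustive search over cluster sizes. The sketch below is only a sanity check, not a proof; the helper names are assumptions.

```python
from math import comb


def partitions(total, max_parts, largest=None):
    """Yield non-increasing tuples of positive integers with the given sum and at most max_parts parts."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest


def cluster_edges(sizes):
    """Number of edges of a cluster graph with the given cluster sizes."""
    return sum(comb(s, 2) for s in sizes)


def check_lemma(a, b):
    """Check the bound and the equality case for all cluster graphs on a*b vertices with <= a clusters."""
    bound = a * comb(b, 2)
    for sizes in partitions(a * b, a):
        edges = cluster_edges(sizes)
        if edges < bound:
            return False
        if edges == bound and sizes != (b,) * a:
            return False
    return True
```

Only the multiset of cluster sizes matters for the edge count, so enumerating integer partitions of $ab$ into at most $a$ parts covers every cluster graph up to isomorphism.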

We are now ready to show how to translate a $p'$-cluster graph $G_0$ with $p' \le 6p$, $\mathcal{H}(G_0, G) \le k'$
into a satisfying assignment of the input formula $\Phi$.

Lemma 15. If there exists a $p'$-cluster graph $G_0$ with $V(G) = V(G_0)$, $p' \le 6p$, $\mathcal{H}(G, G_0) \le k'$,
then the formula $\Phi$ is satisfiable.

Proof. By Lemma 5, we may assume that each clique $Q^r_\alpha$ is contained in one cluster in $G_0$. Let
$F = E(G_0) \triangle E(G)$ be the editing set, $|F| \le k'$.

Before we start, we present some intuition. The cluster graph $G_0$ may differ from the one
constructed in the completeness step in two significant ways, both leading to some savings in the
edges of $G[\mathcal{W} \cup \mathcal{S}]$ that may not be included in $F$. First, it may not be true that each cluster contains
exactly one clique $Q^r_\alpha$. However, since the number of clusters is at most $6p$, this may happen only
if some clusters contain more than one clique $Q^r_\alpha$, and we need to add edges to merge each pair
of cliques that belong to the same cluster. Second, a vertex $v \in \mathcal{W} \cup \mathcal{S}$ may not be contained in
a cluster together with one of the cliques it is adjacent to. However, as each such vertex needs to
be separated from all its adjacent cliques (compared to all but one in the completeness step), this
costs us additional $L$ edges to remove. The large constant in front of the definition of $L$ ensures
that in both these ways we pay more than we save on the edges of $G[\mathcal{W} \cup \mathcal{S}]$. We now proceed to
the formal argumentation.

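We define the following quantities.

$$\ell_{\mathcal{Q}-\mathcal{Q}} = |F \cap (\mathcal{Q} \times \mathcal{Q})|, \qquad \ell_{\mathcal{Q}-\mathcal{WS}} = |F \cap E_G(\mathcal{Q}, \mathcal{W} \cup \mathcal{S})|,$$

$$\ell^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} = |E(G_0[\mathcal{W} \cup \mathcal{S}])|,$$

$$\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} = |E_G(\mathcal{W}, \mathcal{W}) \setminus F|, \qquad \ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} = |E_G(\mathcal{W}, \mathcal{S}) \setminus F|.$$

Recall that $k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} = |E(G[\mathcal{W} \cup \mathcal{S}])| = 6n' + 27m'$. Similarly as in the completeness proof, we
have that

$$|F| \ge \ell_{\mathcal{Q}-\mathcal{Q}} + \ell_{\mathcal{Q}-\mathcal{WS}} + \ell^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} + k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} - 2\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} - 2\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}.$$

Indeed, $\ell_{\mathcal{Q}-\mathcal{Q}}$ and $\ell_{\mathcal{Q}-\mathcal{WS}}$ count (possibly not all) edges of $F$ that are incident to the vertices of
$\mathcal{Q}$. The edges of $F \cap ((\mathcal{W} \cup \mathcal{S}) \times (\mathcal{W} \cup \mathcal{S}))$ are counted in an indirect way: each edge of $G[\mathcal{W} \cup \mathcal{S}]$
is deleted ($k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}}$) and each edge of $G_0[\mathcal{W} \cup \mathcal{S}]$ is added ($\ell^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}}$). Then, the edges that are
counted twice in this manner are subtracted ($\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ and $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$).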

We say that a cluster is crowded if it contains at least two cliques $Q^r_\alpha$ and proper if it contains
exactly one clique $Q^r_\alpha$. A clique $Q^r_\alpha$ that is contained in a crowded (proper) cluster is called a
crowded (proper) clique.

Let $a$ be the number of crowded cliques. Note that

$$\ell_{\mathcal{Q}-\mathcal{Q}} - k_{\mathcal{Q}-\mathcal{Q}} = |F \cap (\mathcal{Q} \times \mathcal{Q})| - 0 \ge \frac{aL^2}{2},$$

as each vertex in a crowded clique needs to be connected to at least one other crowded clique.

We say that a vertex $v \in \mathcal{W} \cup \mathcal{S}$ is attached to a clique $Q^r_\alpha$ if it is adjacent to all vertices of the
clique in $G$. Moreover, we say that a vertex $v \in \mathcal{W} \cup \mathcal{S}$ is alone if it is contained in a cluster in $G_0$
that does not contain any clique $v$ is attached to. Let $n^{\mathrm{alone}}$ be the number of alone vertices.

Let us now count the number of vertices attached to a fixed clique $Q^r_\alpha$. Recall that $|\mathrm{Vars}^r| =
n'/p$. For each variable $x \in \mathrm{Vars}^r$ the clique $Q^r_\alpha$ is attached to two vertices $w^x_{\alpha-1,\alpha}$ and $w^x_{\alpha,\alpha+1}$.
Moreover, each variable $x \in \mathrm{Vars}^r$ appears in exactly six clauses: thrice positively and thrice
negatively. For each such clause $C$, $Q^r_\alpha$ is attached to the vertex $s^C_{\beta,2}$ for exactly one choice of the
value $1 \le \beta \le 3$ and to the vertex $s^C_{\beta,3}$ for exactly one choice of the value $1 \le \beta \le 3$. Moreover, if
$x$ appears in $C$ positively and $\alpha$ is odd, or if $x$ appears in $C$ negatively and $\alpha$ is even, then $Q^r_\alpha$ is
attached to the vertex $s^C_{\beta,1}$ for exactly one choice of the value $1 \le \beta \le 3$. We infer that the clique
$Q^r_\alpha$ is attached to exactly fifteen vertices from $\mathcal{S}$ for each variable $x \in \mathrm{Vars}^r$. Therefore, there are
exactly $17|\mathrm{Vars}^r| = 17n'/p$ vertices of $\mathcal{W} \cup \mathcal{S}$ attached to $Q^r_\alpha$: $2n'/p$ from $\mathcal{W}$ and $15n'/p$ from $\mathcal{S}$.

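Take an arbitrary vertex $v \in \mathcal{W} \cup \mathcal{S}$ and assume that $v$ is attached to $b_v$ cliques, and $a_v$ out of
them are crowded. As $F$ needs to contain all edges of $G$ that connect $v$ with cliques that belong
to a different cluster than $v$, we infer that $|F \cap E_G(\{v\}, \mathcal{Q})| \ge (b_v - \max(1, a_v))L$. Moreover, if $v$
is alone, $|F \cap E_G(\{v\}, \mathcal{Q})| \ge b_v L \ge 1 \cdot L + (b_v - \max(1, a_v))L$. Hence

$$\ell_{\mathcal{Q}-\mathcal{WS}} = |F \cap E_G(\mathcal{Q}, \mathcal{W} \cup \mathcal{S})| \ge n^{\mathrm{alone}} L + \sum_{v \in \mathcal{W} \cup \mathcal{S}} (b_v - \max(1, a_v))L.$$

Recall that $\sum_{v \in \mathcal{W} \cup \mathcal{S}} (b_v - 1)L = k_{\mathcal{Q}-\mathcal{WS}}$. Therefore, using the fact that each clique is attached to
exactly $17n'/p$ vertices of $\mathcal{W} \cup \mathcal{S}$, we obtain that

$$\ell_{\mathcal{Q}-\mathcal{WS}} - k_{\mathcal{Q}-\mathcal{WS}} \ge n^{\mathrm{alone}} L - \sum_{v \in \mathcal{W} \cup \mathcal{S}} (\max(1, a_v) - 1)L \ge n^{\mathrm{alone}} L - 17aLn'/p.$$
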
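In $G_0$, the vertices of $\mathcal{W} \cup \mathcal{S}$ are split between $p' \le 6p$ clusters and there are $6n' + 9m'$ of them.
By Lemma 14, the minimum number of edges of $G_0[\mathcal{W} \cup \mathcal{S}]$ is attained when all clusters are of
equal size and the number of clusters is maximum possible. We infer that $\ell^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} \ge k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}}$.

We are left with $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ and $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$. Recall that $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ counts three edges out of each 6-cycle
constructed per variable of $\Phi'$, $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} = 3n'$, whereas $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$ counts one edge per each vertex
$s^C_{\beta,\xi} \in \mathcal{S}$, $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} = 9m' = |\mathcal{S}|$.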

Consider a crowded cluster $K$ with $c > 1$ crowded cliques. We say that $K$ interferes with a
vertex $v \in \mathcal{W} \cup \mathcal{S}$ if $v$ is attached to a clique in $K$. As each clique is attached to exactly $17n'/p$
vertices of $\mathcal{W} \cup \mathcal{S}$, $2n'/p$ belonging to $\mathcal{W}$ and $15n'/p$ to $\mathcal{S}$, in total at most $2an'/p$ vertices of $\mathcal{W}$
interfere with a crowded cluster and at most $15an'/p$ vertices of $\mathcal{S}$.

Fix a variable $x \in \mathrm{Vars}(\Phi')$. If none of the vertices $w^x_{\alpha,\alpha+1} \in \mathcal{W}$ interferes with any crowded
cluster $K$, then all the cliques $Q^{r(x)}_{\alpha'}$, $1 \le \alpha' \le 6$, are proper cliques, each contained in a different
cluster in $G_0$. Moreover, if additionally no vertex $w^x_{\alpha,\alpha+1}$, $1 \le \alpha \le 6$, is alone, then in the 6-cycle
constructed for the variable $x$ at most three edges are not in $F$. On the other hand, if some of the
vertices $w^x_{\alpha,\alpha+1} \in \mathcal{W}$ interfere with a crowded cluster $K$, or at least one of them is alone, it may
happen that all six edges of this 6-cycle are contained in one cluster of $G_0$. The total number of
6-cycles that contain either alone vertices or vertices interfering with crowded clusters is bounded by
$n^{\mathrm{alone}} + an'/p$, as every clique is attached to exactly $n'/p$ 6-cycles. In $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ we counted three edges
per 6-cycle, while in $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$ we counted at most three edges per every 6-cycle except 6-cycles
that either contain alone vertices or vertices attached to crowded cliques, for which we counted at
most six edges. Hence, we infer that

$$\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} - k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} \le 3(n^{\mathrm{alone}} + an'/p).$$

We claim that if a vertex $s^C_{\beta,\xi} \in \mathcal{S}$ (i) is not alone, and (ii) is not attached to a crowded clique,
and (iii) is not adjacent to any alone vertex in $\mathcal{W}$, then at most one edge from $E(\{s^C_{\beta,\xi}\}, \mathcal{W})$ may
not be in $F$. Recall that $s^C_{\beta,\xi}$ has exactly three neighbors in $\mathcal{W}$, each of them attached to exactly
two cliques and all these six cliques are pairwise distinct; moreover, $s^C_{\beta,\xi}$ is attached only to these
six cliques, if $\xi = 2, 3$, or only to three out of these six, if $\xi = 1$. Observe that (i) and (ii) imply
that $s^C_{\beta,\xi}$ is in the same cluster as exactly one of the six cliques attached to its neighbors in $\mathcal{W}$,
so if it was in the same cluster as two of its neighbors in $\mathcal{W}$, then at least one of them would
be alone, contradicting (iii). However, if at least one of (i), (ii) or (iii) is not satisfied, then all
three edges incident to $s^C_{\beta,\xi}$ may be contained in one cluster. As each vertex in $\mathcal{W}$ is adjacent to
at most 18 vertices in $\mathcal{S}$ (at most 3 per every clause in which the variable is present), there are
at most $18n^{\mathrm{alone}}$ vertices $s^C_{\beta,\xi}$ that are alone or adjacent to an alone vertex in $\mathcal{W}$. Note also that
the number of vertices of $\mathcal{S}$ interfering with crowded clusters is bounded by $15an'/p$, as each of $a$
crowded cliques has exactly $15n'/p$ vertices of $\mathcal{S}$ attached. Thus, we are able to bound the number
of vertices of $\mathcal{S}$ for which (i), (ii) or (iii) does not hold. As in $k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$ we counted one edge per every
vertex of $\mathcal{S}$, while in $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$ we counted at most one edge per every vertex of $\mathcal{S}$ except vertices not
satisfying (i), (ii), or (iii), for which we counted at most three edges, we infer that

$$\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} - k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} \le 2(18n^{\mathrm{alone}} + 15an'/p).$$

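Summing up all the bounds:

$$\begin{aligned}
|F| - k' &\ge (\ell_{\mathcal{Q}-\mathcal{Q}} - k_{\mathcal{Q}-\mathcal{Q}}) + (\ell_{\mathcal{Q}-\mathcal{WS}} - k_{\mathcal{Q}-\mathcal{WS}}) + (\ell^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} - k^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}}) \\
&\quad - 2(\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} - k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}) - 2(\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} - k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}) \\
&\ge aL^2/2 + n^{\mathrm{alone}} L - 17aLn'/p + 0 - 6(n^{\mathrm{alone}} + an'/p) - 4(18n^{\mathrm{alone}} + 15an'/p) \\
&\ge a + n^{\mathrm{alone}} \ge 0.
\end{aligned}$$

The second to last inequality follows from the choice of the value of $L$, $L = 1000 \cdot \left(1 + \frac{n'}{p}\right)$; note
that in particular $L > 1000$.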
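
The second to last inequality can be verified by hand. The following is a sketch under the assumption, suggested by the definition of $L$, that $L \ge 1000$ and $L \ge 1000\,n'/p$; we write $t = n'/p$, so that $L \ge 500(1 + t)$, and we use the coefficients $6 = 2 \cdot 3$, $72 = 2 \cdot 2 \cdot 18$ and $60 = 2 \cdot 2 \cdot 15$ coming from the two bounds on the saved edges.

```latex
\begin{align*}
\frac{aL^2}{2} - 17aLt - 6at - 60at
  &\ge 250aL(1+t) - 17aLt - 66aLt  && \text{since } L \ge 500(1+t) \text{ and } L \ge 1 \\
  &=   250aL + 167aLt \;\ge\; a, \\
n^{\mathrm{alone}}L - 6n^{\mathrm{alone}} - 72n^{\mathrm{alone}}
  &=   (L - 78)\,n^{\mathrm{alone}} \;\ge\; n^{\mathrm{alone}}  && \text{since } L \ge 1000.
\end{align*}
```

Adding the two lines gives the claimed lower bound $a + n^{\mathrm{alone}}$; the constants are far from tight, which is why any sufficiently large multiple in the definition of $L$ would do.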

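We infer that $a = 0$, that is, each clique $Q^r_\alpha$ is contained in a different cluster of $G_0$, and each
cluster of $G_0$ contains exactly one such clique. Moreover, $n^{\mathrm{alone}} = 0$, that is, each vertex $v \in \mathcal{W} \cup \mathcal{S}$
is contained in a cluster with at least one clique $v$ is attached to; as all cliques are proper, $v$ is
contained in a cluster with exactly one clique $v$ is attached to and $\ell_{\mathcal{Q}-\mathcal{WS}} = k_{\mathcal{Q}-\mathcal{WS}}$.

Recall that $|F \cap ((\mathcal{W} \cup \mathcal{S}) \times (\mathcal{W} \cup \mathcal{S}))| = \ell^{\mathrm{all}}_{\mathcal{WS}-\mathcal{WS}} + k^{\mathrm{exist}}_{\mathcal{WS}-\mathcal{WS}} - 2\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} - 2\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$. As each clique
is now proper and no vertex is alone, for each variable $x$ at most three edges out of the 6-cycle
$w^x_{\alpha,\alpha+1}$, $1 \le \alpha \le 6$, are not in $F$, that is, $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}} \le k^{\mathrm{save}}_{\mathcal{W}-\mathcal{W}}$. Moreover, for each vertex $s^C_{\beta,\xi} \in \mathcal{S}$, the
three neighbors of $s^C_{\beta,\xi}$ are contained in different clusters and at most one edge incident to $s^C_{\beta,\xi}$ is
not in $F$, that is, $\ell^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}} \le k^{\mathrm{save}}_{\mathcal{W}-\mathcal{S}}$. As $|F| \le k'$, these inequalities are tight: exactly three edges out
of each 6-cycle are not in $F$, and exactly one edge adjacent to each vertex in $\mathcal{S}$ is not in $F$.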

Consider an assignment $\phi'$ of $\mathrm{Vars}(\Phi')$ that assigns $\phi'(x) = 1$ if the vertices $w^x_{\alpha,\alpha+1}$, $1 \le \alpha \le 6$,
are contained in clusters with cliques $Q^{r(x)}_1$, $Q^{r(x)}_3$, and $Q^{r(x)}_5$ (i.e., the edges $w^x_{6,1}w^x_{1,2}$, $w^x_{2,3}w^x_{3,4}$ and
$w^x_{4,5}w^x_{5,6}$ are not in $F$) and $\phi'(x) = 0$ otherwise (i.e., if the vertices $w^x_{\alpha,\alpha+1}$, $1 \le \alpha \le 6$, are contained
in clusters with cliques $Q^{r(x)}_2$, $Q^{r(x)}_4$ and $Q^{r(x)}_6$); a direct check shows that these are the only ways
to save 3 edges inside a 6-cycle. We claim that $\phi'$ satisfies $\Phi'$. Consider a clause $C$. The vertex
$s^C_{1,1}$ is contained in a cluster with one of the three cliques it is attached to (as $n^{\mathrm{alone}} = 0$), say $Q^r_\alpha$,
and with one of the three vertices of $\mathcal{W}$ it is adjacent to, say $w^x_{\alpha',\alpha'+1}$. Therefore $r(x) = r$, $w^x_{\alpha',\alpha'+1}$
is contained in the same cluster as $Q^r_\alpha$, and $\phi'(x)$ satisfies the clause $C$. □

Figure 3: The gadget for a clause $C$ with variables $x$, $y$ and $z$.

5 General clustering under ETH

In this section we prove Theorem 3, namely that the Cluster Editing problem without restriction
on the number of clusters in the output does not admit a subexponential algorithm unless the
Exponential Time Hypothesis fails.

The following lemma provides a linear reduction from the problem of verifying satisfiability of 
3-CNF formulas. 

Lemma 16. There exists a polynomial-time algorithm that, given a 3-CNF formula $\Phi$ with $n$
variables and $m$ clauses, constructs a Cluster Editing instance $(G, k)$ such that (i) $\Phi$ is satisfiable
if and only if $(G, k)$ is a YES-instance, and (ii) $|V(G)| + |E(G)| + k = O(n + m)$.

Proof. By standard arguments, we may assume that each clause of $\Phi$ consists of exactly three
literals with different variables and each variable appears at least twice: at least once in a positive
literal and at least once in a negative one. Let $\mathrm{Vars}(\Phi)$ denote the set of variables of $\Phi$. For a
variable $x$, let $s_x$ be the number of appearances of $x$ in the formula $\Phi$. For a clause $C$ with variables
$x$, $y$, and $z$, we denote by $l_{x,C}$ the literal of $C$ that contains $x$ (i.e., $l_{x,C} = x$ or $l_{x,C} = \neg x$).

Construction. We construct a graph $G = (V, E)$ as follows. First, for each variable $x$ we introduce
a cycle $A_x$ of length $4s_x$. For each clause $C$ where $x$ appears we assign four consecutive vertices
$a^j_{x,C}$, $1 \le j \le 4$, on the cycle $A_x$. If the vertices assigned to a clause $C'$ follow the vertices assigned
to a clause $C$ on the cycle $A_x$, we let $a^5_{x,C} = a^1_{x,C'}$.

Second, for each clause $C$ with variables $x$, $y$, and $z$ we introduce a gadget with 6 vertices
$V_C = \{p_x, p_y, p_z, q_x, q_y, q_z\}$ with all inner edges except for $q_xq_y$, $q_yq_z$, and $q_zq_x$ (see Figure 3).
If $l_{x,C} = x$ then we connect $q_x$ to the vertices $a^1_{x,C}$ and $a^2_{x,C}$, and if $l_{x,C} = \neg x$, we connect
$q_x$ to $a^2_{x,C}$ and $a^3_{x,C}$. We proceed analogously for variables $y$ and $z$ in the clause $C$. We set
$k = 8m + 2\sum_{x \in \mathrm{Vars}(\Phi)} s_x = 14m$. This finishes the construction of the Cluster Editing instance
$(G, k)$. Clearly $|V(G)| + |E(G)| + k = O(n + m)$. We now prove that $(G, k)$ is a YES-instance if
and only if $\Phi$ is satisfiable.
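
The construction is small enough to build explicitly. The sketch below assumes the attachment rule read off from the completeness and soundness arguments (a positive literal joins $q_x$ to $a^1_{x,C}, a^2_{x,C}$, a negative one to $a^2_{x,C}, a^3_{x,C}$); the vertex naming scheme and the function name are illustrative choices.

```python
from collections import defaultdict


def build_instance(clauses):
    """Build the edge set of G and the budget k.

    clauses: list of 3-tuples of (variable, is_positive) literals with distinct variables.
    Vertices are tuples: ("a", x, i) on the cycle A_x, ("p", c, x) and ("q", c, x) in gadgets.
    """
    occurrences = defaultdict(list)        # variable -> clause indices in cycle order
    for c, clause in enumerate(clauses):
        for var, _ in clause:
            occurrences[var].append(c)

    edges = set()

    def add(u, v):
        edges.add(frozenset((u, v)))

    # Cycle A_x of length 4 * s_x for every variable x.
    for var, occ in occurrences.items():
        length = 4 * len(occ)
        for i in range(length):
            add(("a", var, i), ("a", var, (i + 1) % length))

    for c, clause in enumerate(clauses):
        ps = [("p", c, var) for var, _ in clause]
        qs = [("q", c, var) for var, _ in clause]
        for i in range(3):
            for j in range(i + 1, 3):
                add(ps[i], ps[j])          # the p-vertices form a triangle
            for q in qs:
                add(ps[i], q)              # every p is adjacent to every q
        for (var, positive), q in zip(clause, qs):
            base = 4 * occurrences[var].index(c)   # position of a^1_{x,C} on A_x
            first = base if positive else base + 1
            add(q, ("a", var, first))
            add(q, ("a", var, first + 1))

    k = 8 * len(clauses) + 2 * sum(len(occ) for occ in occurrences.values())
    return edges, k
```

On a formula with $m$ clauses this yields $18m$ vertices, $30m$ edges and $k = 14m$, in line with the linear size bound, and every vertex has degree at most 5.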

Completeness. Assume that $\Phi$ is satisfiable, and let $\phi$ be a satisfying assignment for $\Phi$. We
construct a set $F \subseteq V \times V$ as follows. First, for each variable $x$ we take into $F$ the edges $a^2_{x,C}a^3_{x,C}$,
$a^4_{x,C}a^5_{x,C}$ for each clause $C$ if $\phi(x)$ is true and the edges $a^1_{x,C}a^2_{x,C}$, $a^3_{x,C}a^4_{x,C}$ for each clause $C$
otherwise. Second, let $C$ be a clause of $\Phi$ with variables $x$, $y$, and $z$ and, without loss of generality,
assume that the literal $l_{x,C}$ satisfies $C$ in the assignment $\phi$. For such a clause $C$ we add to $F$ eight
elements: the edges $q_xp_x$, $q_xp_y$, $q_xp_z$, the four edges that connect $q_y$ and $q_z$ to the cycles $A_y$ and
$A_z$, and the non-edge $q_yq_z$.

Clearly $|F| = \sum_{x \in \mathrm{Vars}(\Phi)} 2s_x + 8m = k$. We now verify that $G \triangle F$ is a cluster graph. For each
cycle $A_x$, the removal of the edges in $F$ results in splitting the cycle into $2s_x$ two-vertex clusters. For
each clause $C$ with variables $x$, $y$, $z$, satisfied by the literal $l_{x,C}$ in the assignment $\phi$, the vertices
$p_x$, $p_y$, $p_z$, $q_y$, and $q_z$ form a 5-vertex cluster. Moreover, since $l_{x,C}$ is true in $\phi$, the edge that
connects the two neighbors of $q_x$ on the cycle $A_x$ is not in $F$, thus $q_x$ and these two neighbors form
a three-vertex cluster.

Soundness. Let $F$ be a minimum size feasible solution to the Cluster Editing instance $(G, k)$.
By Lemma 5, for each clause $C$ with variables $x$, $y$, and $z$, the vertices $p_x$, $p_y$, and $p_z$ are contained in
a single cluster in $G \triangle F$. Denote the vertex set of this cluster by $Z_C$. We choose $F$ (with minimum
possible cardinality) such that the number of clusters $Z_C$ that are contained in the vertex set $V_C$
is maximum possible.

Informally, we are going to show that the solution $F$ needs to look almost like the one constructed
in the proof of completeness. The crucial observation is that if we want to create a six-vertex cluster
$Z_C = V_C$ then we need to put nine (instead of eight) elements in $F$ that are incident to $V_C$. Let us
now proceed to the formal arguments.

Fix a variable $x$ and let $F_x = F \cap (V(A_x) \times V(A_x))$. We claim that $|F_x| \ge 2s_x$ and, moreover,
if $|F_x| = 2s_x$ then $F_x$ consists of every second edge of the cycle $A_x$. Note that $A_x \triangle F_x$ is a cluster
graph; assume that there are $\gamma$ clusters in $A_x \triangle F_x$ with sizes $\alpha_j$ for $1 \le j \le \gamma$. If $\gamma = 1$ then, as
$s_x \ge 2$,

$$|F_x| = \binom{\alpha_1}{2} - \alpha_1 = \binom{4s_x}{2} - 4s_x = 8s_x^2 - 6s_x > 2s_x.$$

Otherwise, in a cluster with $\alpha_j$ vertices we need to add at least $\binom{\alpha_j}{2} - (\alpha_j - 1)$ edges and remove
at least two edges of $A_x$ leaving the cluster. Using $\sum_{j=1}^{\gamma} \alpha_j = 4s_x$, we infer that

$$|F_x| \ge \sum_{j=1}^{\gamma} \left(1 + \binom{\alpha_j}{2} - (\alpha_j - 1)\right) = \frac{1}{2}\sum_{j=1}^{\gamma} \left(\alpha_j^2 - 3\alpha_j + 4\right) = 2s_x + \frac{1}{2}\sum_{j=1}^{\gamma} (\alpha_j - 2)^2.$$

Thus, $|F_x| \ge 2s_x$ and $|F_x| = 2s_x$ only if for all $1 \le j \le \gamma$ we have $\alpha_j = 2$ and in each two-vertex
cluster of $A_x \triangle F_x$, $F_x$ does not contain the edge in this cluster and contains two edges of $A_x$ that
leave this cluster. This situation occurs only if $F_x$ consists of every second edge of the cycle $A_x$.
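
For the smallest admissible case $s_x = 2$ (a cycle on 8 vertices) the claim can be confirmed by brute force over all partitions of the vertex set. The sketch below is a sanity check with illustrative function names, not part of the proof.

```python
from itertools import combinations


def set_partitions(n):
    """Yield every partition of range(n) as a list of blocks."""
    def rec(i, blocks):
        if i == n:
            yield [list(b) for b in blocks]
            return
        for b in blocks:                 # put element i into an existing block
            b.append(i)
            yield from rec(i + 1, blocks)
            b.pop()
        blocks.append([i])               # or open a new block for it
        yield from rec(i + 1, blocks)
        blocks.pop()
    yield from rec(0, [])


def editing_cost(n, partition):
    """Edits needed to turn the n-cycle into the cluster graph given by the partition."""
    cycle = {frozenset((i, (i + 1) % n)) for i in range(n)}
    inside = {frozenset(pair) for block in partition for pair in combinations(block, 2)}
    return len(cycle - inside) + len(inside - cycle)   # deleted edges + added edges


def optimal_cycle_editings(n):
    """Return (minimum cost, number of partitions attaining it)."""
    best, count = None, 0
    for part in set_partitions(n):
        cost = editing_cost(n, part)
        if best is None or cost < best:
            best, count = cost, 1
        elif cost == best:
            count += 1
    return best, count
```

For the 8-cycle the minimum is $4 = 2s_x$ and it is attained by exactly two partitions, namely the two perfect matchings formed by every second edge of the cycle.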

We now focus on a gadget for some clause $C$ with variables $x$, $y$, and $z$. Let $F_C = F \cap (V_C \times
(V_C \cup V(A_x) \cup V(A_y) \cup V(A_z)))$. We claim that $|F_C| \ge 8$ and there are very limited ways in which
we can obtain $|F_C| = 8$.

Recall that the vertices $p_x$, $p_y$, and $p_z$ are contained in a single cluster in $G \triangle F$ with vertex set
$Z_C$. We now distinguish subcases, depending on how many of the vertices $q_x$, $q_y$, and $q_z$ are in $Z_C$.

If $q_x, q_y, q_z \notin Z_C$, then $\{p_x, p_y, p_z\} \times \{q_x, q_y, q_z\} \subseteq F_C$ and $|F_C| \ge 9$.

If $q_x \in Z_C$, but $q_y, q_z \notin Z_C$, then $\{p_x, p_y, p_z\} \times \{q_y, q_z\} \subseteq F_C$. If there is a vertex $v \in Z_C \setminus V_C$,
then $F$ needs to contain three elements $vp_x$, $vp_y$, and $vp_z$. In this case $F'$ constructed from $F$
by replacing all elements incident to $\{q_x, p_x, p_y, p_z\}$ with all eight edges of $G$ incident to this set
is a feasible solution to $(G, k)$ of size smaller than $F$, a contradiction to the assumption of the
minimality of $F$. Thus, $Z_C = \{q_x, p_x, p_y, p_z\}$, and $F_C$ contains the eight edges of $G$ incident to $Z_C$.

If $q_x, q_y \in Z_C$ but $q_z \notin Z_C$, then $q_zp_x, q_zp_y, q_zp_z, q_xq_y \in F_C$. If there is a vertex $v \in Z_C \setminus V_C$,
then $F_C$ contains the three elements $vp_x$, $vp_y$, $vp_z$ and at least one of the elements $vq_x$, $vq_y$. In this case $F'$
constructed from $F$ by replacing all elements incident to $\{p_x, p_y, p_z, q_x, q_y\}$ with all seven edges of
$G$ incident to this set and a non-edge $q_xq_y$ is a feasible solution to $(G, k)$ of size not greater than $F$,
with $Z_C \subseteq V_C$, a contradiction to the choice of $F$. Thus $Z_C = \{p_x, p_y, p_z, q_x, q_y\}$ and $F_C$ contains
all seven edges incident to $Z_C$ and the non-edge $q_xq_y$.

In the last case, $V_C \subseteq Z_C$, and $q_xq_y, q_yq_z, q_zq_x \in F_C$. There are six edges connecting $V_C$ and
$V(A_x) \cup V(A_y) \cup V(A_z)$ in $G$, and all these edges are incident to different vertices of $V(A_x) \cup
V(A_y) \cup V(A_z)$. Let $uv$ be one of these six edges, $u \in V_C$, $v \notin V_C$. If $v \in Z_C$ then $F$ contains five
non-edges connecting $v$ to $V_C \setminus \{u\}$. Otherwise, if $v \notin Z_C$, $F$ contains the edge $uv$. We infer that
$F_C$ contains at least six elements that have exactly one endpoint in $V_C$ and $|F_C| \ge 9$.

We now note that the sets $F_C$ for all clauses $C$ and the sets $F_x$ for all variables $x$ are pairwise
disjoint. Recall that $|F_x| \ge 2s_x$ for any variable $x$ and $|F_C| \ge 8$ for any clause $C$. As $|F| \le 14m =
8m + \sum_{x \in \mathrm{Vars}(\Phi)} 2s_x$, we infer that $|F_x| = 2s_x$ for any variable $x$, $|F_C| = 8$ for any clause $C$ and $F$ contains
no elements that are not in any set $F_x$ or $F_C$.

As $|F_x| = 2s_x$ for each variable $x$, the set $F_x$ consists of every second edge of the cycle $A_x$. We
construct an assignment $\phi$ as follows: $\phi(x)$ is true if for all clauses $C$ where $x$ appears we have
$a^2_{x,C}a^3_{x,C}, a^4_{x,C}a^5_{x,C} \in F$ and $\phi(x)$ is false if $a^1_{x,C}a^2_{x,C}, a^3_{x,C}a^4_{x,C} \in F$. We claim that $\phi$ satisfies $\Phi$.
Consider a clause $C$ with variables $x$, $y$, and $z$. As $|F_C| = 8$, by the analysis above one of two
situations occurs: $|Z_C| = 4$, say $Z_C = \{p_x, p_y, p_z, q_x\}$, or $|Z_C| = 5$, say $Z_C = \{p_x, p_y, p_z, q_x, q_y\}$. In
both cases, $F_C$ consists only of all edges of $G$ that connect $Z_C$ with $V(G) \setminus Z_C$ and the non-edges
of $G[Z_C]$. Thus, in both cases the two edges that connect $q_z$ with the cycle $A_z$ are not in $F$. Thus,
the two neighbors of $q_z$ on the cycle $A_z$ are connected by an edge not in $F$, and the literal $l_{z,C}$ satisfies the
clause $C$. □

Lemma 16 directly implies the proof of Theorem 3.

Proof of Theorem 3. A subexponential algorithm for Cluster Editing, combined with the
reduction shown in Lemma 16, would give a subexponential (in the number of variables and clauses)
algorithm for verifying satisfiability of 3-CNF formulas. The existence of such an algorithm is known
to violate ETH [29]. □

We note that the graph constructed in the proof of Lemma 16 is of maximum degree 5. Thus
our reduction shows that sparse instances of Cluster Editing where in the output the clusters
are of constant size are hard.



6 Conclusion and open questions 

We gave an algorithm that solves $p$-Cluster Editing in time $O(2^{O(\sqrt{pk})} + n + m)$. We also have
shown that the running time of our algorithm is asymptotically tight, by presenting a multivariate
lower bound, and that the bound on the number of clusters is essential for subexponential
tractability.

In our multivariate lower bound it is crucial that the cliques and clusters are arranged in groups
of six, so that the cliques adjacent to each clause vertex are always pairwise different. However,
the drawback of this construction is that Theorem 2 settles the time complexity of the $p$-Cluster
Editing problem for fixed $p$ only for $p \ge 6$ (Corollary 2). It does not seem unreasonable that, for
example, the 2-Cluster Editing problem, already NP-complete [33], may have enough structure
to allow a faster algorithm, running in time subexponential in the number of vertices of the graph.
Can we show such an algorithm or refute its existence under ETH?



Acknowledgements. We thank Christian Komusiewicz for pointing us to the recent results on
Cluster Editing ^EI] and his thesis [30].



References 

[1] N. Ailon, M. Charikar, and A. Newman, Aggregating inconsistent information: Ranking and
clustering, in Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC 2005),
ACM, 2005, pp. 684-693.

[2] N. Ailon, M. Charikar, and A. Newman, Aggregating inconsistent information: Ranking and
clustering, J. ACM, 55 (2008), pp. 23:1-23:27.

[3] N. Alon, D. Lokshtanov, and S. Saurabh, Fast FAST, in Proceedings of the 36th International 
Colloquium on Automata, Languages and Programming (ICALP 2009), vol. 5555 of Lecture Notes in 
Comput. Sci., Springer, 2009, pp. 49-58. 

[4] N. Alon, K. Makarychev, Y. Makarychev, and A. Naor, Quadratic forms on graphs, in
Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC 2005), ACM, 2005,
pp. 486-493.

[5] S. Arora, E. Berger, E. Hazan, G. Kindler, and M. Safra, On non-approximability for
quadratic programs, in Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer
Science (FOCS 2005), IEEE Computer Society, 2005, pp. 206-215.

[6] N. Bansal, A. Blum, and S. Chawla, Correlation clustering, Machine Learning, 56 (2004),
pp. 89-113.

[7] A. Ben-Dor, R. Shamir, and Z. Yakhini, Clustering gene expression patterns, Journal of
Computational Biology, 6 (1999), pp. 281-297.

[8] S. Böcker, A golden ratio parameterized algorithm for cluster editing, in IWOCA, C. S. Iliopoulos and
W. F. Smyth, eds., vol. 7056 of Lecture Notes in Computer Science, Springer, 2011, pp. 85-95.

[9] S. Böcker, S. Briesemeister, Q. B. A. Bui, and A. Truss, A fixed-parameter approach for
weighted cluster editing, in Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008),
vol. 6 of Advances in Bioinformatics and Computational Biology, 2008, pp. 211-220.

[10] S. Böcker, S. Briesemeister, and G. W. Klau, Exact algorithms for cluster editing: Evaluation
and experiments, Algorithmica, 60 (2011), pp. 316-334.

[11] S. Böcker and P. Damaschke, Even faster parameterized cluster deletion and cluster editing, Inf.
Process. Lett., 111 (2011), pp. 717-721.

[12] H. L. Bodlaender, M. R. Fellows, P. Heggernes, F. Mancini, C. Papadopoulos, and F. A. 
Rosamond, Clustering with partial information, Theor. Comput. Sci., 411 (2010), pp. 1202-1211. 

[13] Y. Cao and J. Chen, Cluster editing: Kernelization based on edge cuts, in Proceedings of the 5th 
International Symposium on Parameterized and Exact Computation (IPEC 2010), vol. 6478 of Lecture 
Notes in Computer Science, Springer, 2010, pp. 60-71. 

[14] M. Charikar, V. Guruswami, and A. Wirth, Clustering with qualitative information, in
Proceedings of the 44th Symposium on Foundations of Computer Science (FOCS 2003), IEEE Computer
Society, 2003, pp. 524-533.

[15] M. Charikar and A. Wirth, Maximizing quadratic programs: Extending Grothendieck's inequality, in
Proceedings of the 45th Symposium on Foundations of Computer Science (FOCS 2004), IEEE Computer
Society, 2004, pp. 54-60.






[16] J. Chen and J. Meng, A 2k kernel for the cluster editing problem, Journal of Computer and System
Sciences, 78 (2012), pp. 211-220.

[17] P. Damaschke, Fixed-parameter enumerability of cluster editing and related problems, Theory Comput.
Syst., 46 (2010), pp. 261-283.

[18] E. D. Demaine, F. V. Fomin, M. Hajiaghayi, and D. M. Thilikos, Subexponential parameterized
algorithms on graphs of bounded genus and H-minor-free graphs, J. Assoc. Comput. Mach., 52 (2005),
pp. 866-893.

[19] R. G. Downey and M. R. Fellows, Parameterized Complexity, Springer-Verlag, New York, 1999.

[20] M. R. Fellows, J. Guo, C. Komusiewicz, R. Niedermeier, and J. Uhlmann, Graph-based data
clustering with overlaps, Discrete Optimization, 8 (2011), pp. 2-17.

[21] J. Flum and M. Grohe, Parameterized Complexity Theory, Texts in Theoretical Computer Science.
An EATCS Series, Springer-Verlag, Berlin, 2006.

[22] F. V. Fomin and D. Kratsch, Exact Exponential Algorithms, An EATCS Series: Texts in Theoretical 
Computer Science, Springer, 2010. 

[23] F. V. Fomin and Y. Villanger, Subexponential parameterized algorithm for minimum fill-in, in
Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012), SIAM,
2012, pp. 1737-1746.

[24] I. Giotis and V. Guruswami, Correlation clustering with a fixed number of clusters, in Proceedings
of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2006), ACM Press, 2006,
pp. 1167-1176.

[25] J. Gramm, J. Guo, F. Hüffner, and R. Niedermeier, Graph-modeled data clustering: Exact
algorithms for clique generation, Theory Comput. Syst., 38 (2005), pp. 373-392.

[26] J. Guo, A more effective linear kernelization for cluster editing, Theor. Comput. Sci., 410 (2009), 
pp. 718-726. 

[27] J. Guo, I. A. Kanj, C. Komusiewicz, and J. Uhlmann, Editing graphs into disjoint unions of 
dense clusters, Algorithmica, 61 (2011), pp. 949-970. 

[28] J. Guo, C. Komusiewicz, R. Niedermeier, and J. Uhlmann, A more relaxed model for graph-based 
data clustering: s-plex cluster editing, SIAM J. Discrete Math., 24 (2010), pp. 1662-1683. 

[29] R. Impagliazzo, R. Paturi, and F. Zane, Which problems have strongly exponential complexity?, 
J. Comput. Syst. Sci., 63 (2001), pp. 512-530. 

[30] C. Komusiewicz, Parameterized Algorithmics for Network Analysis: Clustering & Querying, PhD
thesis, Technische Universität Berlin, 2011. Available at
http://fpt.akt.tu-berlin.de/publications/diss-komusiewicz.pdf

[31] C. Komusiewicz and J. Uhlmann, Alternative parameterizations for cluster editing, in SOFSEM, 
I. Cerna, T. Gyimothy, J. Hromkovic, K. G. Jeffery, R. Kralovic, M. Vukolic, and S. Wolf, eds., vol. 6543 
of Lecture Notes in Computer Science, Springer, 2011, pp. 344-355. 

[32] F. Protti, M. D. da Silva, and J. L. Szwarcfiter, Applying modular decomposition to
parameterized cluster editing problems, Theory Comput. Syst., 44 (2009), pp. 91-104.

[33] R. Shamir, R. Sharan, and D. Tsur, Cluster graph modification problems, Discrete Applied
Mathematics, 144 (2004), pp. 173-182.






